Blog Entry 10 years, 5 months ago

QuerySet count() and MySQL

One thing that I just found out yesterday is that duplicated results of QuerySet exist. That is also mentioned inside the Django document but I saw that only after I knew it. That could be acceptable, and easy to deal with in case of getting normal results (and they waste resources). But that is an issue if you want to get the count of results

One thing that I just found out yesterday is that duplicated results of QuerySet exist. That is also mentioned inside the Django document but I saw that only after I knew it. That could be acceptable, and easy to deal with in case of getting normal results (and they waste resources). But that is an issue if you want to get the count of results (using .count()), especially when there are OR in query condition. Django only returns the sum of all separate OR cases. That's bad. 

Let take an example:

We have: 1 bike in RED, 1 bike in GREEN and RED. Then if you query to count bike having:
>> RED color, it returns: 2 (perfect!)
>> GREEN color, it returns: 1 (perfect!)
>> RED or GREEN color, it returns: 2 + 1 = 3 (not good)

Here is kind of SQL it makes:

SELECT COUNT(*) FROM `vpr_content_material` INNER JOIN `vpr_content_material_categories` ON (`vpr_content_material`.`id` = `vpr_content_material_categories`.`material_id`) WHERE (`vpr_content_material_categories`.`category_id` LIKE BINARY '%1%' OR `vpr_content_material_categories`.`category_id` LIKE BINARY '%5%' )

Follow the official Django document, you can use distinct to avoid redundancy in data, but some back-end DBs don't support distinct on the whole record (all fields), e.g. MySQL.

Although I will try to modify the distinct call to get it work with MySQL, I'm still going to do the DB switch.

 

Recent Reads