Product Attributes Indexing Error – Integrity Constraint Violation

 

I recently came across an error while working on a client’s site that took me a bit to track down. When trying to index the site through the admin, I was presented with the rather unhelpful error of ‘There was a problem with reindexing process.’ After a good bit of digging, I was able to ultimately find out that it was due to an error in the Import file. Within the imported file was an attribute that was a multiselect type. For several of the products, the same attribute value was applied to this field more than once.

To figure out what the real error was, I used the indexer shell script.

Magento comes with a script in ./shell/ called indexer.php. This file is generally meant to be ran under a cronjob to force the indexing of the site. By SSHing into the server, changing to the shell folder, and the running ‘$ php indexer.php -reindexall’, I was presented with the real error (You can also just run the individual indexer, but I don’t remember the ident to do so). The error being given was ‘Integrity constraint violation: 1026 Duplicate entry ‘XXXX-XXXX-X-XXX’ for key ‘PRIMARY”. This too is a rather unhelpful error.

That said, I did some further digging to figure out what the actual SQL command was that was causing the error. The Error dump was truncating it before any good info was given. Fortunately though it gave me enough information to know how to get the SQL. I went to the file specified, /lib/Varien/Db/Statement/Pdo/Mysql.php and modified the file slightly such that I could get the actual SQL. For those curious, I modified the _prepare method to save the $sql to memory and then just dumped this in the try/catch that was throwing the error.

This led me to the the full SQL of INSERT INTO catalog_product_index_eav ((…)). After a bit of googling, I learned that saving the product in the admin that was specified by the violation (the ???? of ‘????-XXXX-X-XXX’), the error would go away. I tried this and it did…sorta. This just went on to another product. I didn’t have the time to keep saving each product, so I kept digging.

With my experience in Magento, I knew at this point it had to be due to something being wrong in the import file. After obtaining a copy of the file, I wasn’t able to immediately catch what was wrong with it (Just blind I guess – A lot of products with a lot of complex data), so I went back to the database, as well as the code, to obtain the answer.

The catalog_product_index_eav table is comprised of 4 columns: entity_id, attribute_id, store_id, and value. If you have spent as much time as I have in the database, these fields are obvious. The entity_id field is the master id of the product, going back to catalog_product_entity. The attribute_id field relates back to the EAV system and the eav_attribute table. The store_id goes to core_store and finally, value goes to eav_attribute_option_value. In case you are unaware, the EAV System is quite complex. In this case, eav_attribute_option_value maps by option_id to eav_attrubute_option, which is the value in catalog_product_index_eav. This maps to eav_attribute on the attribute_id field. With this info, I did some quick mapping.

At the same time, I was also looking in the code. Again with my experience in Magento, I was looking at the file /app/code/core/CatalogIndex/Model/Indexer/Eav.php. Reviewing the createIndexData method, I noticed that it was doing some special treatment to the multiselect input type. It was splitting the value on the comma and then looping to add the additional values.

Armed with this knowledge, I took a look at catalog_product_entity_varchar, where the values for the multiselect are stored. There, after locating the attribute record in question, I found it to be a string of comma separated numbers, of which one of the numbers was listed twice.

I mapped this number back to eav_attribute_option_value to get the actual text of the attribute. I then checked the import file again and found that the first complex data record and the last were the same. So, after updating the import, go figure the errors would go away.

This little adventure does bring into light a couple of issues. While the root of the issue was the import file, it does point out two other spots that could be addressed as well. The first is the indexer. The indexer could easily be updated such that no duplicates would be added. This however, is more of a cover up than a fix. It prevents the error from occurring, but doesn’t fix the reason the error was occurring in the first place, this being the attribute value id specified twice in the value field. The reason the admin saving fixes the issue is because it covers up this and re saves it so that the attribute id is only there once. The second fix is in the importer. This is a more permanent fix that isn’t a cover up. The importer should be updated so that when the import runs, any complex data is verified for duplication. This may be a bit difficult to come by, but would be the better fix to insure that the data entering the database is accurate to begin with.

 


If you found this article useful, please consider donating to show your support. Need help with something? Feel free to Drop-A-Line and we will try to assist where we can.

 
This entry was posted in Creations, Magento and tagged , , . Bookmark the permalink.