Re: [PHP-DB] Duplicate rows
SELECT DISTINCT * FROM `tablename`

On Wednesday 01 March 2006 7:24 am, Miguel Guirao wrote:
> My dear beloved friends,
>
> I have a catalog of products that a product provider gave me. Sadly for me,
> this CSV file contains many duplicated rows.
> I edited the file on my Linux system with the "uniq -u" command, and it
> worked somewhat: it eliminated some duplicated rows. Originally the
> file had 24K rows, and now it has been reduced to 15K rows.
>
> Anyhow, there are still duplicated rows, and since this is a catalog, it
> should not have duplicated rows!
> Now the catalog has been loaded into the DB.
>
> How can I continue eliminating duplicated rows?
> As far as I remember, there is a statement in SQL to show only ONE row of
> each set of duplicated rows. Maybe if I do a select using this statement
> and then put the new recordset in another table, it will work!
>
> Any ideas?
>
> ---
> Miguel Guirao Aguilera
> Logistica R8 TELCEL
> Tel. (999) 960.7994
>
> This message is for the sole use of the person or entity to whom it is
> being sent. Therefore, it contains strictly confidential and legally
> protected material whose disclosure is subject to penalty by law. If the
> person reading this message is not the one to whom it is being sent and/or
> is not an employee or the responsible agent for this information, this
> person is herein notified that any unauthorized dissemination, distribution
> or copying of the materials included in this facsimile is strictly
> prohibited. If you received this document by mistake please notify
> immediately to the subscriber and destroy the message. Any opinions
> contained in this e-mail are those of the author of the message and do not
> necessarily coincide with those of Radiomovil Dipsa, S.A. de C.V. or any of
> its control, controlled, affiliates and subsidiaries companies. No part of
> this message or attachments may be used or reproduced in any manner
> whatsoever.

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
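A note on the "uniq -u" step in the question: uniq only compares adjacent lines, and the -u flag prints only the lines that are not repeated at all, so on an unsorted file it can miss duplicates that are far apart and drop every copy of a row that repeats back-to-back. Below is a minimal sketch of order-preserving de-duplication, assuming that only exact line-for-line matches count as duplicates. The function name `dedupe_lines` and the sample rows are made up for illustration.

```python
def dedupe_lines(lines):
    """Return the lines with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for line in lines:
        key = line.rstrip("\n")  # ignore trailing-newline differences
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical catalog rows: the duplicates are NOT adjacent,
# which is the case that plain `uniq` does not catch.
sample = ["A1,Widget,1", "B2,Gadget,0", "A1,Widget,1", "C3,Gizmo,1", "B2,Gadget,0"]
deduped = dedupe_lines(sample)
print(deduped)
```

Rows that differ only in whitespace or letter case would still be treated as different here; normalizing them first is a separate decision.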
Re: [PHP-DB] Duplicate rows
Depends on how you determine whether something is a duplicate or not. For example, if a single column is enough to tell, you can do something like this:

    select ItemName, count(ItemName) from ItemListTable group by ItemName having count(ItemName) > 1

That'll show you every "ItemName" value that is repeated. Then you can go back through, search for each "ItemName", and remove the rows you don't want.

You can do pretty much the same thing as above but CONCATenating multiple columns, if that's what you need to do to determine uniqueness.

I know you're still dealing with 15k rows, so you probably want something a little more automated. Without more info, though, it's hard to say exactly what can be done.

Hope that helps a little bit.

-TG

___
Sent by ePrompter, the premier email notification software.
Free download at http://www.ePrompter.com.
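The GROUP BY / HAVING check from the message above can be tried out in a few lines. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `Price` column and the sample rows are assumptions added for illustration. For the multi-column case it groups by several columns rather than concatenating them, which avoids having to pick a separator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Price is a hypothetical second column, not part of the original message.
conn.execute("CREATE TABLE ItemListTable (ItemName TEXT, Price TEXT)")
conn.executemany(
    "INSERT INTO ItemListTable VALUES (?, ?)",
    [
        ("Widget", "1.00"),
        ("Gadget", "2.00"),
        ("Widget", "1.00"),
        ("Widget", "1.50"),
        ("Gizmo", "3.00"),
    ],
)

# Single-column check, as in the message above.
by_name = conn.execute(
    "SELECT ItemName, COUNT(ItemName) FROM ItemListTable "
    "GROUP BY ItemName HAVING COUNT(ItemName) > 1"
).fetchall()

# Multi-column check: a row is a duplicate only if name AND price match.
by_name_and_price = conn.execute(
    "SELECT ItemName, Price, COUNT(*) FROM ItemListTable "
    "GROUP BY ItemName, Price HAVING COUNT(*) > 1"
).fetchall()

print(by_name)
print(by_name_and_price)
```

Which columns go into the GROUP BY is exactly the "how do you determine a duplicate" question: the two queries give different answers for the same data.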
Re: [PHP-DB] Duplicate rows
Assuming you're using MySQL, you can use REPLACE INTO instead of INSERT INTO. If you have unique keys on that table, a new record will overwrite any existing record with the same unique key instead of creating a new one.

http://dev.mysql.com/doc/refman/5.0/en/replace.html

--Ade.

--
Ade Olonoh - Independent Software Developer
http://ade.olonoh.com | http://blog.olonoh.com
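To make the REPLACE INTO behavior concrete, here is a small sketch. It runs against SQLite through Python's sqlite3 module (SQLite also accepts REPLACE INTO), not against MySQL, and the `products` table with `code` as its unique key is an assumption made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; `code` is assumed to be the unique key.
conn.execute("CREATE TABLE products (code TEXT PRIMARY KEY, name TEXT)")

rows = [("A1", "Widget"), ("B2", "Gadget"), ("A1", "Widget v2")]
for row in rows:
    # A plain INSERT INTO would fail on the third row with a
    # uniqueness error; REPLACE INTO swaps out the existing row instead.
    conn.execute("REPLACE INTO products (code, name) VALUES (?, ?)", row)

result = conn.execute("SELECT code, name FROM products ORDER BY code").fetchall()
print(result)
```

Note that the last row loaded for a given key wins, and that this only helps if the table has a unique key covering the columns that define a duplicate.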
Re: [PHP-DB] Duplicate rows
Haha.. oh yeah.. DISTINCT works too.. in this case you'd get one copy of each record, where two records only count as duplicates if they match 100% in every column.

If you had an auto_increment column though, you'd want to exclude it from the column list.

-TG

= = = Original message = = =

SELECT DISTINCT * FROM `tablename`
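The point about the auto_increment column is easy to see on a tiny table. This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for MySQL, and the `catalog` table layout and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout with an auto-generated id column.
conn.execute(
    "CREATE TABLE catalog ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, code TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO catalog (code, name) VALUES (?, ?)",
    [("A1", "Widget"), ("A1", "Widget"), ("B2", "Gadget")],
)

# Every row has its own id, so DISTINCT * removes nothing.
all_columns = conn.execute("SELECT DISTINCT * FROM catalog").fetchall()

# Leaving id out of the column list lets DISTINCT collapse the duplicates.
without_id = conn.execute(
    "SELECT DISTINCT code, name FROM catalog ORDER BY code"
).fetchall()

print(len(all_columns))
print(without_id)
```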
Re: [PHP-DB] Duplicate rows
Ahh, good point, yes, keep in mind you may have some index rows..

On Wednesday 01 March 2006 10:18 am, [EMAIL PROTECTED] wrote:
> If you had an auto_increment column though, you'd want to exclude it from
> the list.
Re: [PHP-DB] Duplicate rows
err columns.. sorry..

On Wednesday 01 March 2006 10:45 am, Micah Stevens wrote:
> Ahh, good point, yes, keep in mind you may have some index rows..
RE: [PHP-DB] Duplicate rows
Thanks to everyone who gave their input to help solve my problem.

It was solved with a simple SQL statement using a sub-query, like:

    insert into products (id, code, `desc`, everused) select distinct '', code, `desc`, everused from catalog

Now my products table has only 8500 records!!

-Original Message-
From: Ade Olonoh [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 1, 2006 12:19 p.m.
To: Miguel Guirao
Cc: php-db@lists.php.net
Subject: Re: [PHP-DB] Duplicate rows

Assuming you're using MySQL, instead of using INSERT INTO, you can use REPLACE INTO instead. If you have unique keys on that table, the new record will overwrite existing records with the same unique keys instead of creating a new one.

http://dev.mysql.com/doc/refman/5.0/en/replace.html

--Ade.
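For anyone who wants to try the INSERT ... SELECT DISTINCT approach on a small scale first, here is a rough sketch. It is not the exact statement from the message: it runs on SQLite through Python's sqlite3 rather than MySQL, the sample rows are invented, the reserved word desc is quoted as "desc", and the id column is simply left out so the database assigns it, instead of passing ''.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names follow the message; the data is made up.
conn.execute('CREATE TABLE catalog (code TEXT, "desc" TEXT, everused INTEGER)')
conn.execute(
    "CREATE TABLE products ("
    'id INTEGER PRIMARY KEY AUTOINCREMENT, code TEXT, "desc" TEXT, everused INTEGER)'
)
conn.executemany(
    "INSERT INTO catalog VALUES (?, ?, ?)",
    [
        ("A1", "Widget", 1),
        ("B2", "Gadget", 0),
        ("A1", "Widget", 1),
        ("B2", "Gadget", 0),
    ],
)

# Copy one row per distinct (code, desc, everused) combination;
# id is omitted so each copied row gets a fresh generated value.
conn.execute(
    'INSERT INTO products (code, "desc", everused) '
    'SELECT DISTINCT code, "desc", everused FROM catalog'
)

count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
codes = conn.execute("SELECT code FROM products ORDER BY code").fetchall()
print(count, codes)
```

Comparing the row counts of the two tables afterwards, as in the message, is a quick sanity check that duplicates were actually dropped.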