The first column is of strings ... Do you mean a single string, as in "KerfufledAllaHasbalah", or a "bunch of strings with some implied delimiter", such as "Kerfufled/Alla/Hasballah", where "/" is the separator between strings?
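To make the second case concrete, here is a rough sketch of what splitting a delimited first column into separate tables might look like. It uses Python's standard sqlite3 module. The table names (item, part, item_part), the normalize() helper, and the "/" separator are all made up for illustration; nothing here is taken from the actual TSV file.

```python
import sqlite3

def normalize(rows, sep="/"):
    """Split each delimited key into parts and store each distinct part once.

    Hypothetical schema, for illustration only:
      item      - one row per original TSV line (holds the integer column)
      part      - one row per distinct substring between separators
      item_part - which parts make up which item, in order
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE item(id INTEGER PRIMARY KEY, value INTEGER NOT NULL);
        CREATE TABLE part(id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE item_part(
            item_id INTEGER NOT NULL REFERENCES item(id),
            pos     INTEGER NOT NULL,
            part_id INTEGER NOT NULL REFERENCES part(id),
            PRIMARY KEY (item_id, pos)
        ) WITHOUT ROWID;
    """)
    for key, value in rows:
        item_id = con.execute(
            "INSERT INTO item(value) VALUES (?)", (value,)
        ).lastrowid
        for pos, name in enumerate(key.split(sep)):
            # Store each distinct part only once, then look up its id.
            con.execute("INSERT OR IGNORE INTO part(name) VALUES (?)", (name,))
            part_id = con.execute(
                "SELECT id FROM part WHERE name = ?", (name,)
            ).fetchone()[0]
            con.execute(
                "INSERT INTO item_part VALUES (?, ?, ?)", (item_id, pos, part_id)
            )
    con.commit()
    return con

con = normalize([("Kerfufled/Alla/Hasballah", 1), ("Kerfufled/Alla", 2)])
print(con.execute("SELECT COUNT(*) FROM part").fetchone()[0])  # prints 3
```

The two sample rows share "Kerfufled" and "Alla", so only three distinct parts are stored. Whether this saves space on the real data depends on how much the strings actually repeat.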
If the latter, the data needs to be normalized.

---
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users [mailto:sqlite-users-
>boun...@mailinglists.sqlite.org] On Behalf Of Peng Yu
>Sent: Wednesday, 10 April, 2019 08:01
>To: SQLite mailing list
>Subject: Re: [sqlite] [EXTERNAL] compressed sqlite3 database file?
>
>I don't know specifically what you refer to as data normalization. My
>guess is something like this. But it is irrelevant to my case.
>
>https://www.studytonight.com/dbms/database-normalization.php
>
>For my specific TSV file, it has about 50 million rows and just two
>columns. The first column is of strings and the second column is of
>integers. All the strings in the first column are unique (some strings
>may be substrings of other strings though).
>
>On 4/10/19, Hick Gunter <h...@scigames.at> wrote:
>> I have the distinct impression that you are attempting to convert a flat
>> file into a naked table and pretending that the result is a (relational)
>> database.
>>
>> Please rethink your approach. There is a design process called
>> "normalization" that needs to be done first. This will identify "entities"
>> (with "attributes") and "relations" that will greatly reduce data
>> duplication found in flat files.
>
>--
>Regards,
>Peng
>_______________________________________________
>sqlite-users mailing list
>sqlite-users@mailinglists.sqlite.org
>http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users