Good idea! I just need to sort the textfile first (in fact, the file is not completely sorted) ;)
Thanks. Discussing this topic gave me a new idea about the right method to adopt. :)

++
Jack

On 22/03/2017 at 16:46, cyrille henry wrote:
> if your textfile is composed of 2 columns of numbers you can optimize the
> search with prior treatment.
>
> 1 : order the index column (already done in your example)
> 2 : create 2 tables: the start index, and the number of occurrences of
> each index
> in your example, the "start index table" would be 0 at 345594, 5 at
> 345595, 15 at 345596, 16 at 345598
> the "number of occurrence index table" would be : 5 at 345594, 10 at
> 345595, 1 at 345596, 4 at 345598
> 3 : put column 2 of your textfile in a "data table"
>
> now, when searching for 345595, you just have to [tabread table1] and
> [tabread table2] at position 345595, and with a small until loop you
> just have to read the data table only where needed.
>
> cheers
> c
>
> On 22/03/2017 at 14:34, Jack wrote:
>> I thought my 2 previous mails were clear enough.
>> But I will answer each point:
>>
>> 1) My previous mails:
>> I need to find every line of a textfile containing a word.
>> The textfile has 2,539,592 lines.
>> Now, I am using [msgfile] from zexy because I can find a line, skip a
>> line and find again ... until the end of the textfile.
>> But I am wondering if there is another object (in another library)
>> that is faster and specialized in this work?
>> ...
>> The textfile has only two "strings" per line.
>> Here are 20 lines of the textfile:
>>
>> 345594 577427
>> 345594 567267
>> 345594 528911
>> 345594 534435
>> 345594 523087
>> 345595 374384
>> 345595 377303
>> 345595 380544
>> 345595 379911
>> 345595 557020
>> 345595 552396
>> 345595 562487
>> 345595 460842
>> 345595 428449
>> 345595 424095
>> 345596 447676
>> 345598 579883
>> 345598 379495
>> 345598 379039
>> 345598 380328
>>
>> 2) See above
>> 3) See above
>> 4) See above
>> 5) Linux/Ubuntu 16.10/Pd 0.47.1
>> 6) Now you're pushing it :)
>>
>> ++
>>
>> Jack
>>
>> On 22/03/2017 at 13:31, Lorenzo Sutton wrote:
>>> Hi,
>>>
>>> On 22/03/2017 13:01, Jack wrote:
>>>> I need to find all instances that match the first row.
>>>> It is not possible with [text search] if I am right.
>>>
>>> I think you should outline your use case/problem in more detail. This
>>> should be a good practice when asking for support on the Mailing List.
>>>
>>> Example:
>>>
>>> 1) I have a text file where each line contains two integers separated
>>> by a space (" ") char - such as (possibly paste a part of the file on
>>> pastebin or similar too).
>>> 213214 12313
>>> 123223 13213
>>>
>>> 2) My file is [always/at least/circa/ ...] 2,539,592 lines long
>>>
>>> 3) My algorithm should find all subsequent lines matching the first line
>>> in the file and return [all line numbers for matches / the total count
>>> of matched lines / ...]
>>>
>>> 3) I want the algorithm to be [as fast as possible / run in under 1
>>> second / run in under 1ms / ... ]
>>>
>>> 4) I [want to / do not need to] use Pd Vanilla
>>>
>>> 5) My patch should run on [All platforms / Windows / OSX / Linux / ...]
>>>
>>> 6) My patch should run [on potentially any machine / on a Raspberry Pi /
>>> on a 1990s 386 machine / on my digital toaster where I have compiled a
>>> custom version of Pd / ... ]
>>>
>>> :)
>>>
>>>> ++
>>>>
>>>> Jack
>>>>
>>>> On 22/03/2017 at 08:27, Liam Goodacre wrote:
>>>>> You can also use [text search], although it's not so easy to find more
>>>>> than the first instance. If you don't mind taking an extra step, you
>>>>> could give each line a third term, which is the line number. Then you
>>>>> can use the "> 3" argument for [text search] to find matches s
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> *From:* Pd-list <pd-list-boun...@lists.iem.at> on behalf of Jack
>>>>> <j...@rybn.org>
>>>>> *Sent:* 21 March 2017 18:14
>>>>> *To:* pd-list@lists.iem.at
>>>>> *Subject:* [PD] Fastest way to find lines in text file
>>>>>
>>>>> Hello,
>>>>>
>>>>> I need to find every line of a textfile containing a word.
>>>>> The textfile has 2,539,592 lines.
>>>>> Now, I am using [msgfile] from zexy because I can find a line, skip a
>>>>> line and find again ... until the end of the textfile.
>>>>> But I am wondering if there is another object (in another library)
>>>>> that is faster and specialized in this work?
>>>>> Thanks.
>>>>> ++
>>>>>
>>>>> Jack
>>>>>
>>>>> _______________________________________________
>>>>> Pd-list@lists.iem.at mailing list
>>>>> UNSUBSCRIBE and account-management ->
>>>>> https://lists.puredata.info/listinfo/pd-list
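
For readers who want to check the logic outside Pd, the three-table lookup that cyrille henry describes in this thread (sort by index, build a "start index" table and a "number of occurrences" table, keep column 2 in a "data table") can be sketched in Python. This is only an illustration under stated assumptions, not the patch anyone in the thread actually built: the function names build_index and lookup are hypothetical, and Python dicts stand in for the Pd tables read with [tabread]. The sort step is included because Jack notes that the real file is not completely sorted.

```python
# Sketch of the start-index / occurrence-count / data-table lookup.
# Assumption: dicts replace the Pd tables; names are illustrative only.

SAMPLE = """\
345594 577427
345594 567267
345594 528911
345594 534435
345594 523087
345595 374384
345595 377303
345595 380544
345595 379911
345595 557020
345595 552396
345595 562487
345595 460842
345595 428449
345595 424095
345596 447676
345598 579883
345598 379495
345598 379039
345598 380328
"""


def build_index(pairs):
    """Build (start, count, data) from an iterable of (key, value) pairs."""
    # Step 1: order by the index column. sorted() is stable, so values
    # that share a key keep their original file order.
    ordered = sorted(pairs, key=lambda pair: pair[0])

    start = {}  # key -> position of its first value in `data`
    count = {}  # key -> number of values stored for that key
    data = []   # column 2 of the file, in sorted-key order

    # Steps 2 and 3: fill the two index tables and the data table.
    for key, value in ordered:
        if key not in start:
            start[key] = len(data)
            count[key] = 0
        count[key] += 1
        data.append(value)
    return start, count, data


def lookup(key, start, count, data):
    """Return every value for `key`, reading only the needed slice."""
    first = start.get(key)
    if first is None:
        return []  # key does not occur in the file
    return data[first:first + count[key]]


pairs = [tuple(int(field) for field in line.split())
         for line in SAMPLE.splitlines()]
start, count, data = build_index(pairs)
print(lookup(345595, start, count, data))
```

The preprocessing pass costs one sort of the whole file, but each later search touches only the entries for the requested index instead of scanning all 2,539,592 lines.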