On 22/03/2017 at 17:10, cyrille henry wrote:
>
>
> On 22/03/2017 at 17:01, Jack wrote:
>> Good idea!
>> I just need to sort the text file (in fact, the file is not fully
>> sorted) ;)
>> Thanx.
>> Talking this through has given me a new idea about the right method
>> to adopt. :)
>
> Since you can do it in a non-real-time way, I think Python has a sort
> function that can do this easily.
> Or try with LibreOffice.
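
The Python route suggested above could look roughly like this. It is only a sketch: it assumes every non-empty line holds whitespace-separated integers with the index in the first column, and the function name `sort_pairs` is made up for illustration.

```python
def sort_pairs(lines):
    """Return the non-empty lines sorted numerically by their first column."""
    # Split each line into fields and skip blank lines.
    rows = [line.split() for line in lines if line.strip()]
    # list.sort() is stable, so lines sharing an index keep their file order.
    rows.sort(key=lambda row: int(row[0]))
    return [" ".join(row) for row in rows]


# A few unsorted lines in the same two-column format as the file.
sample = ["345598 579883", "345594 577427", "345595 374384", "345594 567267"]
for line in sort_pairs(sample):
    print(line)
```

Since the function accepts any iterable of lines, an open file object could be passed to it directly, although a file of about 2.5 million lines would then be held in memory all at once.
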
Or from the command line:
$ sort -k1 -g linksIdOK.txt
++
Jack

>
> cheers
> c
>
>> ++
>>
>> Jack
>>
>>
>>
>> On 22/03/2017 at 16:46, cyrille henry wrote:
>>> If your text file is composed of 2 columns of numbers, you can optimize
>>> the search with some preprocessing.
>>>
>>> 1: sort the index column (already done in your example)
>>> 2: create 2 tables: the start index, and the number of occurrences of
>>> each index.
>>> In your example, the "start index table" would be 0 at 345594, 5 at
>>> 345595, 15 at 345596, 16 at 345598
>>> the "number of occurrences table" would be: 5 at 345594, 10 at
>>> 345595, 1 at 345596, 4 at 345598
>>> 3: put column 2 of your text file in a "data table"
>>>
>>> Now, when searching for 345595, you just have to [tabread table1] and
>>> [tabread table2] at position 345595, and with a small until loop you
>>> read the data table only where needed.
>>>
>>> cheers
>>> c
>>>
>>> On 22/03/2017 at 14:34, Jack wrote:
>>>> I guess my 2 previous mails were clear enough.
>>>> But I will answer each point:
>>>>
>>>> 1) My previous mails:
>>>> I need to find every line of a text file containing a word.
>>>> The text file has 2,539,592 lines.
>>>> Right now I am using [msgfile] from zexy because I can find a line,
>>>> skip a line and find again ... until the end of the text file.
>>>> But I am wondering if there is another object (in another library)
>>>> that is faster and specialized in this kind of work?
>>>> ...
>>>> The text file has only two "strings" per line.
>>>> Here are 20 lines of the text file:
>>>>
>>>> 345594 577427
>>>> 345594 567267
>>>> 345594 528911
>>>> 345594 534435
>>>> 345594 523087
>>>> 345595 374384
>>>> 345595 377303
>>>> 345595 380544
>>>> 345595 379911
>>>> 345595 557020
>>>> 345595 552396
>>>> 345595 562487
>>>> 345595 460842
>>>> 345595 428449
>>>> 345595 424095
>>>> 345596 447676
>>>> 345598 579883
>>>> 345598 379495
>>>> 345598 379039
>>>> 345598 380328
>>>>
>>>> 2) See above
>>>> 3) See above
>>>> 4) See above
>>>> 5) Linux/Ubuntu 16.10/Pd 0.47.1
>>>> 6) Now you're pushing it :)
>>>>
>>>> ++
>>>>
>>>> Jack
>>>>
>>>>
>>>>
>>>>
>>>> On 22/03/2017 at 13:31, Lorenzo Sutton wrote:
>>>>> Hi,
>>>>>
>>>>> On 22/03/2017 13:01, Jack wrote:
>>>>>> I need to find all instances that match the first row.
>>>>>> It is not possible with [text search], if I am right.
>>>>>
>>>>> I think you should outline your use case/problem in more detail. This
>>>>> is good practice when asking for support on the mailing list.
>>>>>
>>>>> Example:
>>>>>
>>>>> 1) I have a text file where each line contains two integers
>>>>> separated by a space (" ") char, such as (possibly paste a part of
>>>>> the file on pastebin or similar too):
>>>>> 213214 12313
>>>>> 123223 13213
>>>>>
>>>>> 2) My file is [always/at least/circa/ ...] 2,539,592 lines long
>>>>>
>>>>> 3) My algorithm should find all subsequent lines matching the first
>>>>> line in the file and return [all line numbers for matches / the
>>>>> total count of matched lines / ...]
>>>>>
>>>>> 3) I want the algorithm to be [as fast as possible / run in under 1
>>>>> second / run in under 1ms / ... ]
>>>>>
>>>>> 4) I [want to / do not need to] use Pd Vanilla
>>>>>
>>>>> 5) My patch should run on [All platforms / Windows / OSX / Linux /
>>>>> ...]
>>>>>
>>>>> 6) My patch should run [on potentially any machine / on a Raspberry
>>>>> Pi / on a 1990s 386 machine / on my digital toaster where I have
>>>>> compiled a custom version of Pd / ... ]
>>>>>
>>>>> :)
>>>>>
>>>>>
>>>>>> ++
>>>>>>
>>>>>> Jack
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 22/03/2017 at 08:27, Liam Goodacre wrote:
>>>>>>> You can also use [text search], although it's not so easy to find
>>>>>>> more than the first instance. If you don't mind taking an extra
>>>>>>> step, you could give each line a third term, which is the line
>>>>>>> number. Then you can use the "> 3" argument for [text search] to
>>>>>>> find matches s
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ------------------------------------------------------------------------
>>>>>>>
>>>>>>>
>>>>>>> *From:* Pd-list <pd-list-boun...@lists.iem.at> on behalf of Jack
>>>>>>> <j...@rybn.org>
>>>>>>> *Sent:* 21 March 2017 18:14
>>>>>>> *To:* pd-list@lists.iem.at
>>>>>>> *Subject:* [PD] Fastest way to find lines in text file
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I need to find every line of a text file containing a word.
>>>>>>> The text file has 2,539,592 lines.
>>>>>>> Right now I am using [msgfile] from zexy because I can find a line,
>>>>>>> skip a line and find again ... until the end of the text file.
>>>>>>> But I am wondering if there is another object (in another
>>>>>>> library) that is faster and specialized in this kind of work?
>>>>>>> Thanx.
>>>>>>> ++
>>>>>>>
>>>>>>> Jack
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Pd-list@lists.iem.at mailing list
>>>>>>> UNSUBSCRIBE and account-management ->
>>>>>>> https://lists.puredata.info/listinfo/pd-list
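
The lookup scheme cyrille describes in this thread (a start-index table, an occurrence-count table and a data table) can be sketched outside Pd roughly as follows. This is an illustration under stated assumptions, not the patch anyone in the thread actually built: Python dicts stand in for the two Pd tables read with [tabread], a list slice stands in for the small until loop, and the names `build_tables`, `lookup` and `SAMPLE` are made up. The input must already be sorted by the first column.

```python
# The 20 sample lines posted in the thread.
SAMPLE = """\
345594 577427
345594 567267
345594 528911
345594 534435
345594 523087
345595 374384
345595 377303
345595 380544
345595 379911
345595 557020
345595 552396
345595 562487
345595 460842
345595 428449
345595 424095
345596 447676
345598 579883
345598 379495
345598 379039
345598 380328
"""


def build_tables(pairs):
    """Build the three lookup structures from (index, value) pairs.

    Assumes `pairs` is sorted by index, so equal indices are contiguous.
    """
    start = {}  # index -> position of its first value in `data`
    count = {}  # index -> number of values stored for that index
    data = []   # column 2 of the file, in file order
    for position, (index, value) in enumerate(pairs):
        if index not in start:
            start[index] = position
            count[index] = 0
        count[index] += 1
        data.append(value)
    return start, count, data


def lookup(index, start, count, data):
    """Return every value for `index`, reading `data` only where needed."""
    first = start.get(index)
    if first is None:
        return []
    return data[first:first + count[index]]


pairs = [tuple(int(field) for field in line.split())
         for line in SAMPLE.splitlines()]
start, count, data = build_tables(pairs)
print(start)  # {345594: 0, 345595: 5, 345596: 15, 345598: 16}
print(count)  # {345594: 5, 345595: 10, 345596: 1, 345598: 4}
print(lookup(345596, start, count, data))  # [447676]
```

The printed tables match the values given in the thread for this excerpt. Each query then costs two table reads plus one read per matching line, rather than a scan of the whole file.
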