Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-29 Thread amenex
The scholar, focusing on the mathematics, admonishes the "pragmatic idealist": >> That is not a problem. That is an algorithm whose first step is not even clear: >> a Google search on {stats "view all sites"} returns a "normal response", a list of 224,000 pages from different websites.

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-28 Thread amenex
Magic Banana politely requested: >> please express the actual problem clearly. [paraphrasing] in less than ten paragraphs The following scheme worked OK with a list of about a million visitors' hostnames. (1) Collect Recent Visitor data with a Google search on {stats "view all sites"}.

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-28 Thread amenex
Hmmm. We seem both to be writing at once ... Magic Banana is saying: Quoting amenex: > I want to guard against double-counting, as with 01j01.txt or 01j02.txt vs 02j01.txt, and that requires > some heavy-duty concentration. >> "My" solution (since my first post in this thread) joins one

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-28 Thread amenex
Magic Banana may be re-stating my objective differently than I have been stating it: >> Also, if, in my previous post, I have understood what you wanted to do with Leafpad and 43 manual executions >> (i.e., "join every file with the union of all other files"), here is a slightly modified

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-28 Thread amenex
Magic Banana requested clarification of my future plans: In reply to what I said: > Repeating it for the other 43 combinations should now be a breeze, as I can switch the file names around with Leafpad. >> I am not sure I understand what you want to do (join every file with the union of

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-28 Thread amenex
Magic Banana asked: >> Here is what I actually proposed (where filenames are appended *before* the join): >> https://trisquel.info/forum/grep-consumes-all-my-ram-and-swap-ding-big-job#comment-142474 When I collapsed the script into a one-liner, it would start ... but nothing ensued for

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-27 Thread amenex
Using awk to execute the series of join commands described above produces only syntax errors. Here are those commands: > join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-26 Thread amenex
Taking Magic Banana's cue, I applied the join command in round-robin fashion: > HNs.Ed.tropic.ssec.wisc.edu.txt join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 join -1 2 -2 2 time cat `ls

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-24 Thread amenex
Followup: Starting with the setup script: > time awk < HNs.bst.lt.txt '{print $2}' > HNusage/HNs.bst.lt/temp When the grep command acts on a group of files that have been sorted, the script works much more quickly as well as utilizing much less RAM without need of swap support: > time grep

Re: [Trisquel-users] Grep consumes all my RAM and swap ding a big job

2019-07-24 Thread amenex
Magic Banana wonders: > However, your input looks wrong: on line 674 of HNs.bst_.lt_.txt, the second column only contains the character 0... > and your 'grep' selects (among others) all the lines that include this character. I assume you want whole domain matches. I checked the original
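
The point about whole domain matches can be illustrated with a small sketch. The file names and sample rows below are hypothetical, not taken from the thread; the only assumption carried over is that the hostname sits in the second column. A plain fixed-string grep treats a pattern such as "0" as a substring, so it selects every line containing that character, whereas an awk lookup on the second field accepts only exact matches.

```shell
# Hypothetical sample data: column 2 holds a hostname (or a stray "0").
printf '%s\n' '12 example.org' '7 0' '3 host10.example.net' > visitors.txt
printf '%s\n' '0' 'example.org' > patterns.txt

# Substring matching: the pattern "0" also selects host10.example.net.
grep -F -f patterns.txt visitors.txt > substring.txt

# Whole-field matching: load the patterns into an array, then keep only
# the lines whose second column is exactly one of them.
awk 'NR == FNR { want[$1]; next } $2 in want' patterns.txt visitors.txt > matched.txt

cat matched.txt
```

With this sample, substring.txt keeps all three lines while matched.txt keeps only the two whose second column equals a pattern exactly. The awk version also holds only the pattern list in memory, which may matter for inputs of the size discussed in this thread.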