At 21:46 -0800 on 11/11/2011, Sumtingwong wrote about Re: Large text search:

 > You can do it in BBEdit using a Perl script, but in what form do you
 > want the results?

John, thanks for your reply.  I have spent my evenings attempting to
write a pithy one liner from the command line to do this, but the
resolution is just not there with grep.  All of the files to be
searched contain paragraphs of text that is soft wrapped (I don't know
if that is the correct term, sorry).  I have not written any Perl in
over 10 years, time to break out the books!

What is needed?  A frequency count of each word in the input file
for each file that was searched.  For example, the first word of the
input file is "it".  Document one is searched for "it" and it shows up
248 times.  Optimal output would be (in tabbed columns):
it     Document 1     248

I know the output is going to be huge (as the input file is rather
large), but that is fine--I just need to get to the analysis part at
this point.
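
For illustration only, the count described above could be sketched in Python roughly as follows. This is not the Perl script suggested earlier in the thread. The function names, the word pattern, and the sample data are assumptions made for the example. Words are counted across the whole text rather than line by line, so soft-wrapped paragraphs and several hits on one line are both handled, whereas grep -c reports matching lines rather than occurrences.

```python
import re
from collections import Counter


def word_counts(text):
    """Count case-insensitive whole-word occurrences in a block of text."""
    # Assumed definition of a "word": runs of letters, digits, apostrophes.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def frequency_rows(words, documents):
    """Yield (word, document name, count) for every word in every document.

    `words` stands in for the words of the input file and `documents`
    maps a document name to its full text; both are hypothetical.
    """
    for name, text in documents.items():
        counts = word_counts(text)
        for word in words:
            # A Counter returns 0 for words that never appear.
            yield word, name, counts[word.lower()]


if __name__ == "__main__":
    docs = {"Document 1": "It was late.\nIt rained, and it\nkept raining."}
    for row in frequency_rows(["it", "rain"], docs):
        # Tab-separated columns: word, document, count.
        print("\t".join(str(field) for field in row))
```

In practice the `documents` mapping would be filled by reading each file to be searched, and the printed rows could be redirected to a file for the analysis step.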

Cheers!

I'm sorry that I cannot help you with a BBEdit-based solution, but I think you might be attempting to use the wrong tool for this project.

This is the type of project that I feel is better suited to a database such as MySQL. You read the first file to populate the database, then update the counts as you read each new file. Once you are done, you can query any word and get its total occurrences or its occurrences per file.

Note that populating and updating the database requires a program that can access it, but once it is populated, the results can be retrieved with any database utility that accepts SQL queries.
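
As a rough sketch of that workflow, the example below uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so it runs without a database server. The table name `word_counts`, its columns, and the helper functions are hypothetical. The two queries show the per-file and total lookups described above.

```python
import re
import sqlite3
from collections import Counter


def build_db(documents):
    """Load per-document word counts into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE word_counts ("
        "word TEXT, doc TEXT, n INTEGER, PRIMARY KEY (word, doc))"
    )
    for doc, text in documents.items():
        # Assumed definition of a "word": letters, digits, apostrophes.
        counts = Counter(re.findall(r"[a-z0-9']+", text.lower()))
        conn.executemany(
            "INSERT INTO word_counts (word, doc, n) VALUES (?, ?, ?)",
            [(word, doc, n) for word, n in counts.items()],
        )
    conn.commit()
    return conn


def per_document(conn, word):
    """Return a list of (document, count) pairs for one word."""
    return conn.execute(
        "SELECT doc, n FROM word_counts WHERE word = ? ORDER BY doc",
        (word,),
    ).fetchall()


def total(conn, word):
    """Return the total occurrences of one word across all documents."""
    row = conn.execute(
        "SELECT COALESCE(SUM(n), 0) FROM word_counts WHERE word = ?",
        (word,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_db({"Document 1": "It is what it is.", "Document 2": "Is it?"})
    print(per_document(conn, "it"))
    print(total(conn, "it"))
```

Once the table is populated, the same SELECT statements could be run from any SQL client, which is the point made above about not needing a custom program for the reporting step.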


