Hi -
OK, I've upgraded using factor-macosx-x86-32-2013-07-25-14-21.dmg
(still version 0.97), and I see the same issue with Factor's "which":

IN: scratchpad USE: tools.which
IN: scratchpad "couchdb" which .
f

IN: scratchpad "python" which .
"/usr/bin/python"

- The trouble appears to be with reporting my PATH properly, via getenv:

IN: scratchpad USE: environment
IN: scratchpad "PATH" os-env .
"/usr/bin:/bin:/usr/sbin:/sbin"

IN: scratchpad USE: unix.ffi
IN: scratchpad "PATH" getenv .
"/usr/bin:/bin:/usr/sbin:/sbin"

IN: scratchpad \ getenv see
USING: alien.c-types alien.syntax ;
IN: unix.ffi
LIBRARY: libc FUNCTION: c-string getenv ( c-string name ) ;
    inline

- Here's my actual PATH, as seen in the terminal:

➜  ~ git:(master) ✗ echo $PATH
/usr/local/bin:/usr/local/opt/ruby/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Users/cwalston/factor:/Users/cwalston/bin:/usr/local/go/bin:/usr/local/lib/node_modules:/usr/local/narwhal/bin:/usr/texbin:/usr/X11/bin:/usr/local/sbin:/Users/cwalston/.gem/ruby/1.8/bin:/Applications/Mozart.app/Contents/Resources/bin

- whereby which correctly finds couchdb:

➜  ~ git:(master) ✗ which couchdb
/usr/local/bin/couchdb

So, Factor's "which" (et al.) doesn't search beyond
/usr/bin:/bin:/usr/sbin:/sbin.
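
For reference, the behaviour above is what one would expect from any
lookup that simply walks whatever PATH value it is handed. Below is a
minimal sketch of such a lookup in POSIX sh. The function name
find_in_path is hypothetical, and this is not Factor's actual
implementation of "which", only an illustration of the general technique:

```shell
#!/bin/sh
# Hypothetical helper: search a colon-separated list of directories
# for an executable file with the given name.
find_in_path() {
    name=$1
    paths=$2
    old_ifs=$IFS
    IFS=:                      # split the path list on colons
    for dir in $paths; do
        # accept only executable entries that are not directories
        if [ -x "$dir/$name" ] && [ ! -d "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1                   # not found in any listed directory
}

# With the short PATH that Factor reports, only those four directories
# are ever examined, so anything in /usr/local/bin cannot be found.
find_in_path sh "/usr/bin:/bin:/usr/sbin:/sbin"
```

A directory that is not in the list is never consulted, which would
explain why couchdb in /usr/local/bin comes back as f.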

Reading through man getenv (GETENV(3), on OS X 10.6.8) doesn't give me
a clue as to how to rectify this short-sightedness via the libc getenv.

This is probably a side issue to my docsplit quandary (but maybe not).
Anyone see a way to report my actual PATH to "which" in Factor? My PATH is
augmented in my .zshrc. I don't understand why the libc function doesn't
read it. Odd, indeed!
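
One possible explanation, offered as an assumption rather than a
diagnosis: getenv only returns what the process inherited from its
parent, and .zshrc is read by interactive zsh sessions, not by libc. If
Factor was started from somewhere other than that zsh session, it would
never see the augmented value. The sketch below simulates this with
env -i, which starts a child with only the variables given on the
command line. The variable names are hypothetical:

```shell
#!/bin/sh
# The value Factor reported above, standing in for the PATH a process
# gets when no shell startup file has augmented it.
default_path="/usr/bin:/bin:/usr/sbin:/sbin"

# A child started this way sees only what its parent hands over.
# Nothing in ~/.zshrc is consulted, because no interactive zsh is involved.
child_sees=$(env -i PATH="$default_path" /bin/sh -c 'printf "%s" "$PATH"')
echo "child sees: $child_sees"

# Passing an augmented PATH explicitly makes it visible to the child.
augmented="/usr/local/bin:$default_path"
child_sees_augmented=$(env -i PATH="$augmented" /bin/sh -c 'printf "%s" "$PATH"')
echo "child sees: $child_sees_augmented"
```

If that is the cause here, starting Factor from the same terminal
session, so that it inherits the zsh PATH, would be one thing to try.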

~cw


On Sat, Feb 8, 2014 at 4:39 PM, John Benediktsson <mrj...@gmail.com> wrote:

> That's odd, Factor's "which" just looks in the $PATH for your executable.
>
>     IN: scratchpad "PATH" os-env
>
> You can read a bit about how it's implemented cross-platform:
>
>     http://re-factor.blogspot.com/2013/01/which.html
>
>
> On Sat, Feb 8, 2014 at 2:30 PM, CW Alston <cwalsto...@gmail.com> wrote:
>
>> Thanks for the replies. Maybe a clue here - I get this from "which":
>>
>> IN: scratchpad USE: tools.which
>> IN: scratchpad "docsplit" which .
>> f
>> IN: scratchpad "couchdb" which .
>> f
>> IN: scratchpad "ruby" which  .
>> f
>>
>> Whereas in the terminal:
>>
>> ➜  ~ git:(master) ✗ which docsplit
>> /usr/local/opt/ruby/bin/docsplit
>>
>> ➜  ~ git:(master) ✗ which couchdb
>> /usr/local/bin/couchdb
>>
>> ➜  ~ git:(master) ✗ which ruby
>> /usr/local/bin/ruby
>>
>> Let me try moving up to the most recent development release
>> & see if the problem disappears. I'll get back to you.
>>
>> Best,
>> ~cw
>>
>>
>>
>> On Sat, Feb 8, 2014 at 7:42 AM, John Benediktsson <mrj...@gmail.com> wrote:
>>
>>> Well if you want process output, you can do something like:
>>>
>>>     { "docsplit" "text" "--no-clean" "-l" "path" } utf8 [ lines ] with-process-reader
>>>
>>> or without output, using a single command string:
>>>
>>>     "docsplit text --no-clean -l path" run-process drop
>>>
>>> You can docsplit a directory of files:
>>>
>>>     ! "-l" expects a language code (e.g. "eng") before the file name
>>>     : docsplit ( file -- )
>>>         { "docsplit" "text" "--no-clean" "-l" }
>>>         swap suffix run-process drop ;
>>>
>>>     : docsplit-all ( path -- )
>>>         directory-files [ docsplit ] each ;
>>>
>>> And concatenate all the files in a directory:
>>>
>>>     # bash
>>>     ls *.txt | sort | xargs -I '{}' cat '{}'
>>>
>>>     # factor
>>>     : cat-results ( path -- lines )
>>>         directory-files [ ".txt" tail? ] filter natural-sort
>>>         [ utf8 file-lines ] map concat ;
>>>
>>> Or something like that, which part are you having problems with?
>>>
>>> Best,
>>> John.
>>>
>>>
>>>
>>> On Sat, Feb 8, 2014 at 2:32 AM, CW Alston <cwalsto...@gmail.com> wrote:
>>>
>>>> Hi folks -
>>>>
>>>> I am thrilled to find a versatile open-source optical character
>>>> recognition engine called docsplit
>>>> <http://documentcloud.github.io/docsplit/>. I've got it installed
>>>> easily as a ruby gem, & it works just great on my Mac as a shell
>>>> command (it also provides a ruby module):
>>>>
>>>> ➜  ~ git:(master) ✗ which docsplit
>>>> /usr/local/opt/ruby/bin/docsplit
>>>> ➜  ~ git:(master) ✗
>>>>
>>>> I need such a tool to extract text from a deep directory tree, with a
>>>> couple thousand folders. Each leaf folder contains 3-6 scanned pdfs
>>>> (in Chinese & English), from which docsplit makes a plaintext (.txt)
>>>> file with the same basename, deposited in the same leaf directory. My
>>>> Factor vocab can easily visit each leaf dir & prepare to pass each pdf
>>>> there to docsplit in the format it happily handles in the terminal (I
>>>> use oh-my-zsh & iTerm2). My Factor code chokes on this intermediate
>>>> step, trying to call docsplit.
>>>>
>>>> Going to the terminal, I have to first cd to the directory containing
>>>> the pdfs, e.g.,
>>>>
>>>> ➜  ~ git:(master) ✗ cd /path/to/1_long_gu
>>>>
>>>> then call docsplit with the appropriate flags on each pdf:
>>>>
>>>> ➜  1_long_gu git:(master) ✗ docsplit text --no-clean -l chi_sim long_gu001.pdf
>>>> ➜  1_long_gu git:(master) ✗ docsplit text --no-clean -l eng long_gu002.pdf
>>>>
>>>> etc., for each pdf, & docsplit gives back a bunch of text files in the
>>>> dir like
>>>>
>>>> /path/to/1_long_gu/long_gu001.txt
>>>>
>>>> In the terminal, even a compound phrase like the following works
>>>> without a hitch:
>>>>
>>>> ➜  ~ git:(master) ✗ cd /path/to/1_long_gu ; docsplit text --no-clean -l chi_sim long_gu001.pdf ; docsplit text --no-clean -l eng long_gu002.pdf ; docsplit text --no-clean -l eng long_gu003.pdf ;...
>>>> ➜  1_long_gu git:(master) ✗
>>>>
>>>> So, working from the terminal, I wind up with a series of text files in
>>>> /path/to/1_long_gu that my Factor vocab amalgamates into a single text
>>>> file (with whitespace in filename), e.g.,
>>>> /path/to/1_long_gu/long gu.txt, which I can edit for mistakes, and
>>>> upload to a couchdb database. Joy!
>>>>
>>>> But I haven't been able to work out how to accomplish this docsplit
>>>> call from Factor code. I have no problem traversing the directory tree
>>>> (Factor's word each-file & the like come in very handy). I've
>>>> experimented with io.launcher, io.pipes, shell scripts (bash, zsh,
>>>> factor), & autoload shell functions, but flunked out. No errors with
>>>> io.launcher tries; just no result. Need to learn something here. I
>>>> routinely launch couchdb as a detached <process>.
>>>>
>>>> It would be such a boon to use docsplit in Factor. After a couple weeks
>>>> lost at sea with this, I'm broadcasting a Mayday. Any suggestions?
>>>>
>>>> Thanks in advance,
>>>> ~cw
>>>>
>>>> --
>>>> *~ Memento Amori*
>>>>
>>>>
>>>> ------------------------------------------------------------------------------
>>>> Managing the Performance of Cloud-Based Applications
>>>> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
>>>> Read the Whitepaper.
>>>>
>>>> http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk
>>>> _______________________________________________
>>>> Factor-talk mailing list
>>>> Factor-talk@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/factor-talk
>>>>
>>>>
>>>
>>
>>
>> --
>> *~ Memento Amori*
>>
>
>


-- 
*~ Memento Amori*