Hi Vish, did you manage to get this working? I am also having
problems using NLTK in a Django app. I'd appreciate any tips you have
on setting things up correctly.
Thanks,
Graham
On Friday, February 1, 2013 10:49:00 AM UTC, Avishalom Shalit wrote:
>
> Thanks.
> Actually, this is an internal app, only available on our VPN,
> so security is not an issue,
> and I expect at most 4 users.
>
> I will look at the other setups.
> Thanks
>
> -- vish
>
>
>
> On 31 January 2013 23:48, Emanuel Ilyayev wrote:
>
>> I don't know NLTK well, but I work with Django :)
>>
>> From Asaf's description, it looks like you have to change your
>> architecture. Apache, in its default configuration, is not efficient
>> with heavy processes because it can spawn additional processes under
>> load, and each one may load its own copy of the application. There are
>> better setups, such as Gunicorn or uWSGI, that load n workers and
>> distribute the work between them (usually n = number of cores x 2 + 1).
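
The worker-count rule of thumb above could be written in a Gunicorn config
file roughly like this. This is only a minimal sketch: the file name
gunicorn.conf.py and the bind address are assumptions, not something taken
from this thread.

```python
# gunicorn.conf.py (assumed file name; Gunicorn reads its settings from a Python file)
import multiprocessing

# Rule of thumb quoted above: number of cores x 2 + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Assumed local address; adjust for your own setup.
bind = "127.0.0.1:8000"
```

It would be started with something like `gunicorn -c gunicorn.conf.py myproject.wsgi`,
where `myproject` is a placeholder project name.
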
>>
>> A more robust and scalable setup would use separate workers that
>> handle the NLTK requests asynchronously, with Django talking to these
>> workers via a message queue. This setup lets you put your NLTK
>> workers on a separate machine, so that your web server does not
>> compete with your NLTK workers for limited resources (CPU
>> and RAM).
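
To make the queue-and-workers idea above concrete, here is a small
self-contained sketch using only the standard library. It is not Celery:
threads and an in-process queue.Queue stand in for separate worker
processes and a real message broker, and heavy_nlp_task is a hypothetical
placeholder for whatever slow NLTK call you actually make.

```python
import queue
import threading


def heavy_nlp_task(text):
    # Hypothetical stand-in for a slow NLTK call; here it only splits on whitespace.
    return text.split()


def worker(jobs, results):
    # Each worker pulls jobs from the queue until it receives the None sentinel.
    while True:
        job = jobs.get()
        if job is None:
            break
        job_id, text = job
        results[job_id] = heavy_nlp_task(text)


def run_jobs(texts, n_workers=2):
    jobs = queue.Queue()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    # The "web" side only enqueues work; it never runs the heavy task itself.
    for job_id, text in enumerate(texts):
        jobs.put((job_id, text))
    # One sentinel per worker tells it to shut down.
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
    # Return results in the order the texts were submitted.
    return [results[i] for i in range(len(texts))]
```

In a real deployment the web process would enqueue the job and return
right away, while the workers keep the heavy library loaded in their own
processes.
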
>>
>> Even if you eventually find a way to configure Apache to load NLTK
>> without crashing, the URL that handles NLTK requests would be a perfect
>> point to attack your server and bring it into a DoS (denial of service)
>> situation, using only a couple of strong machines hitting this URL.
>>
>> I urge you to read a little about gevent and Celery to understand
>> what I'm talking about.
>>
>> HTH
>>
>> --
>> Emanuel
>>
>>
>>
>>
>> On Thu, Jan 31, 2013 at 7:30 PM, asaf greenberg wrote:
>>
>>>
>>> I don't know Django well, but I have worked with NLTK.
>>> NLTK is a very heavy module, so a lag on import is expected, especially
>>> if you're using certain modules.
>>>
>>> AFAIK you should import it only once, on server (re)start, and that
>>> costs about 10-30 secs (did you optimize with .pyc or .pyo files?), unless
>>> you're short on RAM... but I hope that's not the case.
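
One way to check how long the import really takes on your server is to
time it once at startup. A rough sketch follows; timed_import is a
hypothetical helper name, and you would pass "nltk" where the example
below uses a standard-library module.

```python
import importlib
import time


def timed_import(module_name):
    # Import the module and report how long the import took, in seconds.
    start = time.perf_counter()
    module = importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return module, elapsed


# The first import pays the full cost; a repeated import in the same process
# is served from sys.modules, so it returns the same module object.
module, seconds = timed_import("json")
print(f"imported {module.__name__} in {seconds:.3f}s")
```

If the import is cheap when run this way but the request still hangs under
the web server, the import time alone is probably not the cause.
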
>>>
>>> NLTK also has many sub-modules; for performance, you can and should
>>> avoid loading the ones you don't need.
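
To see how much a single import actually pulls in, you can compare
sys.modules before and after it. This is a sketch only; modules_loaded_by
is a hypothetical helper name, and you would pass "nltk" or one of its
sub-modules as the argument.

```python
import importlib
import sys


def modules_loaded_by(module_name):
    # Return the names of modules that were newly loaded by this import.
    before = set(sys.modules)
    importlib.import_module(module_name)
    return sorted(set(sys.modules) - before)
```

Calling it a second time for the same module returns an empty list,
because everything is already cached in sys.modules.
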
>>>
>>> Does it hang elsewhere (apart from server startup)?
>>> Is the delay longer than 20-30 secs?
>>>
>>>
>>>
>>> On 1/31/2013 6:44 PM, Avishalom Shalit wrote:
>>>
>>> As title.
>>>
>>> It just silently hangs.
>>>
>>> As far as I could find on Google, other people have run into it,
>>> but nobody posted a solution.
>>>
>>> Has anybody overcome this before?
>>>
>>> thanks
>>>
>>>
>>> -- vish
>>>
>>>
>>>
>>> _______________________________________________
>>> Python-il mailing list
>>> [email protected]
>>> http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il