[web2py] Re: A basic problem about threading and time consuming function...
Debian/nginx/systemd deployment: I made the scheduler work with the help of https://groups.google.com/d/msg/web2py/eHXwines4o0/i3WqDlKjCQAJ and https://groups.google.com/d/msg/web2py/jFWNnz5cl9U/UpBSkxf4_2kJ

Thank you very much Niphlod, Michael M, Brian M
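For readers landing here for the Debian/nginx/systemd part: the post does not include the unit file, so the following is a hypothetical sketch of running the scheduler workers as a systemd service. The app name `myapp`, paths, and user are placeholders; `python web2py.py -K myapp` is the stock command for starting scheduler workers for an app.

```ini
# /etc/systemd/system/web2py-scheduler.service  (sketch; adjust paths and user)
[Unit]
Description=web2py scheduler workers for myapp
After=network.target

[Service]
User=www-data
WorkingDirectory=/home/www-data/web2py
ExecStart=/usr/bin/python web2py.py -K myapp
Restart=always

[Install]
WantedBy=multi-user.target
```

After `systemctl daemon-reload` and `systemctl enable --now web2py-scheduler`, the workers restart with the machine, independently of nginx and the WSGI processes.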
Yes.
I run with the scheduler already. It is really nice and great!
Moving away from the ajax solution was easy and there were almost no problems. (I have very simple parameters for the task and I return nothing; I just save into the db.)
The resulting code is cleaner (one task-queueing call instead of rendering a hidden html element + js reading from it + ajax call + parsing args).

Maybe my previous mistake (I mean my message here in this thread) will be helpful for others who want to go with the scheduler.

What I need to do now is deployment for the scheduler (on Debian and nginx).

PS: It was quick but important to
- find where I can see code errors (in the scheduler's db tables),
- learn how to set a timeout (in the queue_task call).

Here is the code example - controller and models/scheduler.py:

    def find():
        def onvalidation(form):
            form.vars.asked = datetime.datetime.utcnow()
        form = SQLFORM(db.question)
        if form.process(onvalidation=onvalidation).accepted:
            scheduler.queue_task(task_catalogize,
                pvars={'question_id': form.vars.id,
                       'question': form.vars.question,
                       'asked': str(form.vars.asked)},  # str() so the datetime is JSON-serializable
                timeout=300)
        return dict(form=form)

    import datetime
    from gluon.scheduler import Scheduler

    def task_catalogize(question_id, question, asked):
        asked = datetime.datetime.strptime(asked, '%Y-%m-%d %H:%M:%S.%f')  # deserialize the datetime
        inserted = some_db_actions(question)
        db.question[question_id] = {
            'duration': round((datetime.datetime.utcnow() - asked).total_seconds(), 0),  # same/similar to what the scheduler db tables record
            'inserted': inserted}
        db.commit()

    scheduler = Scheduler(db)
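A note on the `str()` / `strptime` pair in the code example: `queue_task` passes `pvars` through JSON, which is why the datetime travels as a string. The round trip can be checked in plain Python (nothing web2py-specific), and it has one gotcha worth knowing:

```python
import datetime

# The task args go through JSON, so the datetime is passed as str()
# and parsed back inside the task with the matching format string.
asked = datetime.datetime(2016, 5, 5, 11, 6, 4, 123456)
wire = str(asked)                                   # '2016-05-05 11:06:04.123456'
back = datetime.datetime.strptime(wire, '%Y-%m-%d %H:%M:%S.%f')
assert back == asked

# Caveat: when microseconds are exactly 0, str() drops the '.%f' part
# and the strptime above would raise ValueError. A fixed strftime() on
# the way out always emits the fractional part, so it is the safer choice:
wire_safe = asked.strftime('%Y-%m-%d %H:%M:%S.%f')
assert datetime.datetime.strptime(wire_safe, '%Y-%m-%d %H:%M:%S.%f') == asked
```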
NP: as with everything, it's not a silver bullet, but with the redis incarnation I'm sure you can achieve less than 3 seconds (if you tune the heartbeat, even less than 1 second) from when the task gets queued to when it gets processed.
Hi, Niphlod.

After reading a bit about the scheduler,
I am definitely sorry for my previous notes,
and of course I choose the web2py scheduler.

It will be my first use of it (with a much older, ~3-year-old web2py app I have only used cron),
so it will take some time to learn the scheduler. But it is surely worth redesigning the app this way.

Thanks for being patient with me.
Mirek
You are right.
At this time it works well for me via ajax, and I will watch carefully for problems.
If any appear, I will move to the scheduler.

I see this is exactly what Massimo(?) writes at the bottom of the Ajax chapter of the book.

PS: about times:
On a notebook with a mobile connection it takes 20-40s. So it could be dangerous.
On a cloud server with SSD it takes 2-10s. This will be my case. And I feel better when the user can have a typical response in 3s instead of 8s.
the statement "I don't need to use the scheduler, because I want to start it as soon as possible" is flaky at best. If your "fetching" varies from 2 to 20 seconds and COULD extend further to 60 seconds, waiting a few seconds for the scheduler to start the process is, uhm... debatable.
Of course, relying on ajax, if your "fetching" can be killed in the process, is the only other way.

-- 
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
--- 
You received this message because you are subscribed to the Google Groups "web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Thanks for info and tips, 6 years later. What I try to do is a form with single input, where user gives a query string and then data about (usually ~300) books will be retrieved via z39 and marc protocol/format, parsed and saved into local database. Of course this will take a time (2? 5? 20? seconds) and I decided not to show the result immediately, but show the same form with possibility to enter the next query + there is a list of pending queries (and their status - via ajax testing every 5 seconds) So my idea was to provide a return from the controller fast and before the return to start a new thread to retrieve/parse/save/commit data. >From this discussion I understand that open new thread isn't best idea. I think it could be still possible, because if my new thread could be killed 60s later from the web server together with the original thread - such possibility is not fatal problem for me here. However when (as I read here) this would be a little wild technology, and because other technologies mentioned here: https://en.wikipedia.org/wiki/Comet_(programming) -paragraph Aternatives, are too difficult for me, and because I don't want use a scheduler, because I need to start as soon as possible, I will solve it so, that I will make 2 http accesses from my page: one with submit (will validate/save the query to database) and one with ajax/javascript (onSubmit from the old page or better: onPageLoaded from the next page where I give the query in .html DOM as some hidden value), which will start the z39 protocol/retrieve/parse/save data. This will be much better, because web2py in the ajax call will prepare the db variable with proper db model for me (which otherwise I must handle myselves in the separate thread). Callback from this ajax call should/could be some dummy javascript function, because it is not sure, and not important, if the page still exists when the server job will finish. 
So, if somebody is interesting and will read this very old thread, maybe this can give him some idea for time consumming actions. And maybe somebody will add other important hints or comments (thanks in advance). Dne středa 26. května 2010 0:33:02 UTC+2 Giuseppe Luca Scrofani napsal(a): > > Hi all, as promised I'm here to prove you are patient and nice :) > I' have to make this little app where there is a function that read > the html content of several pages of another website (like a spider) > and if a specified keyword is found the app refresh a page where there > is the growing list of "match". > Now, the spider part is already coded, is called search(), it uses > twill to log in the target site, read the html of a list of pages, > perform some searching procedures and keep adding the result to a > list. I integrated this in a default.py controller and make a call in > def index(): > This make the index.html page loading for a long time, because now it > have to finish to scan all pages before return all results. > What I want to achieve is to automatically refresh index every 2 > second to keep in touch with what is going on, seeing the list of > match growing in "realtime". Even better, if I can use some sort of > ajax magic to not refresh the entire page... but this is not vital, a > simple page refresh would be sufficient. > Question is: I have to use threading to solve this problem? > Alternative solutions? > I have to made the list of match a global to read it from another > function? It would be simpler if I made it write a text file, adding a > line for every match and reading it from the index controller? If I > have to use thread it will run on GAE? 
> > Sorry for the long text and for my bad english :)
> > gls

--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups "web2py-users" group. To unsubscribe from this group and stop receiving emails from it, send an email to web2py+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: [web2py] Re: A basic problem about threading and time consuming function...
I started looking at this a bit. You can find the specs for the Comet protocol, such as it is, at http://svn.cometd.com/trunk/bayeux/bayeux.html It's built on top of JSON but isn't quite JSON-RPC; it's more of a publish/subscribe model. The latest version of 'cometd' was just released as a beta, available at http://download.cometd.org/, and includes a JavaScript library for both dojo and jquery (well, it's written in/for dojo, but has a jquery-style interface as well). I started poking at it a bit, but I haven't ever done any jquery so it will probably be slow. My plan is to require basic auth, put the client ID into the session, put any data on subscribed channels into that connection, and then just use long-polling to keep it open. Since there's no real way for a web2py app to be notified of internal state changes, I'm not sure, long term, how I would handle actually looking for anything to send out over the long poll. Though I've had some thoughts of writing a scheduler for web2py with granularity of a second or so. On Tue, May 25, 2010 at 7:24 PM, Allard wrote: > Comet is a nice way to get this done but I wonder how to implement > comet efficiently in web2py. Massimo, does web2py use a threadpool > under the hood? For comet you would then quickly run out of threads. > If you'd try to do this with a thread per connection things would get > out of hand pretty quickly so the best way is doing the work > asynchronously like Orbited. Alternatives would be using one of the > contemporary Python asynchronous libraries. These libraries provide > monkey patching of synchronous calls like your url fetching. Some > suggestions: > > Gevent: now with support of Postgress, probably the fastest out there > Eventlet: used at Lindenlab / Second Life > Concurrence: with handy async mysql interface > Tornado: full async webserver in Python > > Massimo: what do you think of an asynchronous model for web2py? It'd > be great to to have asynchronous capabilities. 
I am writing an app > that will require quite a bit of client initiated background > processing (sending emails, resizing images) which I would rather hand > off to a green thread and not block one the web2py threads. Curious > about your thoughts. > > BTW - my first post here. Started to use for web2py for a community > site and enjoy working in it a lot! Great work. > > On May 25, 9:39 pm, Candid wrote: >> Well, actually there is a way for the server to trigger an action in >> the browser. It's called comet. Of course under the hood it's >> implemented on top of http, so it's browser who initiates request, but >> from the developer perspective it looks like there is dual channel >> connection between the browser and the server, and they both can send >> messages to each other asynchronously. There are several >> implementation of comet technology. I've used Orbited (http:// >> orbited.org/) and it worked quite well for me. >> >> On May 25, 9:00 pm, mdipierro wrote: >> >> > I would use a background process that does the work and adds the items >> > to a database table. The index function would periodically refresh or >> > pull an updated list via ajax from the database table. there is no way >> > for te server to trigger an action in the browser unless 1) the >> > browser initiates it or 2) the client code embeds an ajax http server. >> > I would stay away from 1 and 2 and >> > use reload of ajax. >> >> > On May 25, 5:33 pm, Giuseppe Luca Scrofani >> > wrote: >> >> > > Hi all, as promised I'm here to prove you are patient and nice :) >> > > I' have to make this little app where there is a function that read >> > > the html content of several pages of another website (like a spider) >> > > and if a specified keyword is found the app refresh a page where there >> > > is the growing list of "match". 
>> > > Now, the spider part is already coded, is called search(), it uses >> > > twill to log in the target site, read the html of a list of pages, >> > > perform some searching procedures and keep adding the result to a >> > > list. I integrated this in a default.py controller and make a call in >> > > def index(): >> > > This make the index.html page loading for a long time, because now it >> > > have to finish to scan all pages before return all results. >> > > What I want to achieve is to automatically refresh index every 2 >> > > second to keep in touch with what is going on, seeing the list of >> > > match growing in "realtime". Even better, if I can use some sort of >> > > ajax magic to not refresh the entire page... but this is not vital, a >> > > simple page refresh would be sufficient. >> > > Question is: I have to use threading to solve this problem? >> > > Alternative solutions? >> > > I have to made the list of match a global to read it from another >> > > function? It would be simpler if I made it write a text file, adding a >> > > line for every match and reading it from the index controller? If I >> > > have to use thread it will run on GAE? >>
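The long-polling plan sketched earlier in this message (client ID in the session, per-client channels, a request that blocks until data arrives) can be illustrated with a minimal stdlib sketch. The names `subscribe`, `publish`, and `long_poll` are hypothetical, and a real Bayeux/cometd deployment negotiates the client ID during its handshake rather than via basic auth.

```python
import queue

# One message queue per subscribed client ID.
channels = {}

def subscribe(client_id):
    channels[client_id] = queue.Queue()

def publish(client_id, message):
    """A server-side event source pushes data onto the client's channel."""
    channels[client_id].put(message)

def long_poll(client_id, timeout=30):
    """The handler the browser keeps re-issuing: block up to `timeout`
    seconds waiting for a message, so the connection stays open."""
    try:
        return channels[client_id].get(timeout=timeout)
    except queue.Empty:
        return None  # nothing arrived; the client simply re-polls
```

The browser issues the next `long_poll` request as soon as each one returns, which is what keeps the channel effectively always open.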
Re: [web2py] Re: A basic problem about threading and time consuming function...
Thanks to all for answering, friends! I've extracted some good info from this discussion, and the solution proposed by Massimo works well :)
[web2py] Re: A basic problem about threading and time consuming function...
I won't have time to work out an async proof of concept right now. I hope to get to this after some more real-world profiling with my web2py app, though. To give you an idea of how an async web framework could feel as natural in programming style as web2py (e.g. no callbacks all over the place), have a look at the Concurrence documentation if you're interested: http://opensource.hyves.org/concurrence/index.html To implement async for web2py is probably for the most part straightforward (monkey patching all the IO). The trouble will be with external libraries that block and can't be monkey patched, for example db drivers. Maybe those blocking calls are best dealt with in a thread pool and queue. The idea of Comet is to keep the connection open to the client and flow data as it becomes available: http://en.wikipedia.org/wiki/Comet_%28programming%29 It saves the overhead of a client polling at intervals and establishing the connection each time. In a thread-per-connection model you would need to keep a thread available per client. A thread per client can get expensive quickly and does not scale nicely: after a few hundred connections most servers slow down dramatically because of thread context switching. See also: http://www.kegel.com/c10k.html For most web apps a thread per connection (from a threadpool) won't be a problem, but for things like Ajax email applications or chat/IM it does get troublesome. On May 25, 10:59 pm, mdipierro wrote: > On May 25, 9:24 pm, Allard wrote: > > > Comet is a nice way to get this done but I wonder how to implement > > comet efficiently in web2py. > > I have never used comet but I do not see any major problem > > > Massimo, does web2py use a threadpool > > under the hood? For comet you would then quickly run out of threads. > > The web server creates a thread pool. for stand alone web2py that > would be Rocket. > You do not run out of them any more than any other web app. 
> > > > > If you'd try to do this with a thread per connection things would get > > out of hand pretty quickly so the best way is doing the work > > asynchronously like Orbited. Alternatives would be using one of the > > contemporary Python asynchronous libraries. These libraries provide > > monkey patching of synchronous calls like your url fetching. Some > > suggestions: > > > Gevent: now with support of Postgress, probably the fastest out there > > Eventlet: used at Lindenlab / Second Life > > Concurrence: with handy async mysql interface > > Tornado: full async webserver in Python > > > Massimo: what do you think of an asynchronous model for web2py? It'd > > be great to to have asynchronous capabilities. I am writing an app > > that will require quite a bit of client initiated background > > processing (sending emails, resizing images) which I would rather hand > > off to a green thread and not block one the web2py threads. Curious > > about your thoughts. > > I do not think we can use async IO with web2py. async IO as far as I > understand would require a different programming style. > Anyway, if you have a working proof of concept I would like to see it. > > Massimo > > > > > BTW - my first post here. Started to use for web2py for a community > > site and enjoy working in it a lot! Great work. > > > On May 25, 9:39 pm, Candid wrote: > > > > Well, actually there is a way for the server to trigger an action in > > > the browser. It's called comet. Of course under the hood it's > > > implemented on top of http, so it's browser who initiates request, but > > > from the developer perspective it looks like there is dual channel > > > connection between the browser and the server, and they both can send > > > messages to each other asynchronously. There are several > > > implementation of comet technology. I've used Orbited (http:// > > > orbited.org/) and it worked quite well for me. 
> > > > On May 25, 9:00 pm, mdipierro wrote: > > > > > I would use a background process that does the work and adds the items > > > > to a database table. The index function would periodically refresh or > > > > pull an updated list via ajax from the database table. there is no way > > > > for te server to trigger an action in the browser unless 1) the > > > > browser initiates it or 2) the client code embeds an ajax http server. > > > > I would stay away from 1 and 2 and > > > > use reload of ajax. > > > > > On May 25, 5:33 pm, Giuseppe Luca Scrofani > > > > wrote: > > > > > > Hi all, as promised I'm here to prove you are patient and nice :) > > > > > I' have to make this little app where there is a function that read > > > > > the html content of several pages of another website (like a spider) > > > > > and if a specified keyword is found the app refresh a page where there > > > > > is the growing list of "match". > > > > > Now, the spider part is already coded, is called search(), it uses > > > > > twill to log in the target site, read the html of a list of pages, > > > > > perform some searching procedures and keep a
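The suggestion above of pushing un-patchable blocking calls (such as db drivers) into a thread pool and queue might look like this minimal sketch; `blocking_db_call` is a stand-in for a real driver call, not an actual API.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# A pool reserved for calls that block and cannot be monkey patched
# (a db driver, say); an async loop would submit work here and resume
# when the future resolves, instead of tying up a request thread.
blocking_pool = ThreadPoolExecutor(max_workers=4)

def blocking_db_call(sql):
    """Stand-in for a driver that blocks on network I/O."""
    time.sleep(0.05)
    return [('row', sql)]

def run_in_pool(sql):
    future = blocking_pool.submit(blocking_db_call, sql)
    return future.result()  # an async framework would await this instead
```

The pool bounds the number of concurrently blocked threads, so a flood of slow queries queues up rather than exhausting the server's threads.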
[web2py] Re: A basic problem about threading and time consuming function...
On May 25, 9:24 pm, Allard wrote: > Comet is a nice way to get this done but I wonder how to implement > comet efficiently in web2py. I have never used comet but I do not see any major problem. > Massimo, does web2py use a threadpool > under the hood? For comet you would then quickly run out of threads. The web server creates a thread pool; for stand-alone web2py that would be Rocket. You do not run out of them any more than in any other web app. > If you'd try to do this with a thread per connection things would get > out of hand pretty quickly so the best way is doing the work > asynchronously like Orbited. Alternatives would be using one of the > contemporary Python asynchronous libraries. These libraries provide > monkey patching of synchronous calls like your url fetching. Some > suggestions: > > Gevent: now with support of Postgress, probably the fastest out there > Eventlet: used at Lindenlab / Second Life > Concurrence: with handy async mysql interface > Tornado: full async webserver in Python > > Massimo: what do you think of an asynchronous model for web2py? It'd > be great to to have asynchronous capabilities. I am writing an app > that will require quite a bit of client initiated background > processing (sending emails, resizing images) which I would rather hand > off to a green thread and not block one the web2py threads. Curious > about your thoughts. I do not think we can use async IO with web2py. Async IO, as far as I understand, would require a different programming style. Anyway, if you have a working proof of concept I would like to see it. Massimo > > BTW - my first post here. Started to use for web2py for a community > site and enjoy working in it a lot! Great work. > > On May 25, 9:39 pm, Candid wrote: > > > Well, actually there is a way for the server to trigger an action in > > the browser. It's called comet. 
Of course under the hood it's > > implemented on top of http, so it's browser who initiates request, but > > from the developer perspective it looks like there is dual channel > > connection between the browser and the server, and they both can send > > messages to each other asynchronously. There are several > > implementation of comet technology. I've used Orbited (http:// > > orbited.org/) and it worked quite well for me. > > > On May 25, 9:00 pm, mdipierro wrote: > > > > I would use a background process that does the work and adds the items > > > to a database table. The index function would periodically refresh or > > > pull an updated list via ajax from the database table. there is no way > > > for te server to trigger an action in the browser unless 1) the > > > browser initiates it or 2) the client code embeds an ajax http server. > > > I would stay away from 1 and 2 and > > > use reload of ajax. > > > > On May 25, 5:33 pm, Giuseppe Luca Scrofani > > > wrote: > > > > > Hi all, as promised I'm here to prove you are patient and nice :) > > > > I' have to make this little app where there is a function that read > > > > the html content of several pages of another website (like a spider) > > > > and if a specified keyword is found the app refresh a page where there > > > > is the growing list of "match". > > > > Now, the spider part is already coded, is called search(), it uses > > > > twill to log in the target site, read the html of a list of pages, > > > > perform some searching procedures and keep adding the result to a > > > > list. I integrated this in a default.py controller and make a call in > > > > def index(): > > > > This make the index.html page loading for a long time, because now it > > > > have to finish to scan all pages before return all results. > > > > What I want to achieve is to automatically refresh index every 2 > > > > second to keep in touch with what is going on, seeing the list of > > > > match growing in "realtime". 
Even better, if I can use some sort of > > > > ajax magic to not refresh the entire page... but this is not vital, a > > > > simple page refresh would be sufficient. > > > > Question is: I have to use threading to solve this problem? > > > > Alternative solutions? > > > > I have to made the list of match a global to read it from another > > > > function? It would be simpler if I made it write a text file, adding a > > > > line for every match and reading it from the index controller? If I > > > > have to use thread it will run on GAE? > > > > > Sorry for the long text and for my bad english :) > > > > > gls
[web2py] Re: A basic problem about threading and time consuming function...
Comet is a nice way to get this done, but I wonder how to implement comet efficiently in web2py. Massimo, does web2py use a threadpool under the hood? For comet you would then quickly run out of threads. If you tried to do this with a thread per connection, things would get out of hand pretty quickly, so the best way is doing the work asynchronously, like Orbited. Alternatives would be using one of the contemporary Python asynchronous libraries. These libraries provide monkey patching of synchronous calls like your url fetching. Some suggestions:

- Gevent: now with support of Postgres, probably the fastest out there
- Eventlet: used at Lindenlab / Second Life
- Concurrence: with handy async mysql interface
- Tornado: full async webserver in Python

Massimo: what do you think of an asynchronous model for web2py? It'd be great to have asynchronous capabilities. I am writing an app that will require quite a bit of client-initiated background processing (sending emails, resizing images) which I would rather hand off to a green thread than block one of the web2py threads. Curious about your thoughts. BTW - my first post here. Started to use web2py for a community site and enjoy working in it a lot! Great work. On May 25, 9:39 pm, Candid wrote: > Well, actually there is a way for the server to trigger an action in > the browser. It's called comet. Of course under the hood it's > implemented on top of http, so it's browser who initiates request, but > from the developer perspective it looks like there is dual channel > connection between the browser and the server, and they both can send > messages to each other asynchronously. There are several > implementation of comet technology. I've used Orbited (http:// > orbited.org/) and it worked quite well for me. > > On May 25, 9:00 pm, mdipierro wrote: > > > I would use a background process that does the work and adds the items > > to a database table. 
The index function would periodically refresh or > > pull an updated list via ajax from the database table. there is no way > > for te server to trigger an action in the browser unless 1) the > > browser initiates it or 2) the client code embeds an ajax http server. > > I would stay away from 1 and 2 and > > use reload of ajax. > > > On May 25, 5:33 pm, Giuseppe Luca Scrofani > > wrote: > > > > Hi all, as promised I'm here to prove you are patient and nice :) > > > I' have to make this little app where there is a function that read > > > the html content of several pages of another website (like a spider) > > > and if a specified keyword is found the app refresh a page where there > > > is the growing list of "match". > > > Now, the spider part is already coded, is called search(), it uses > > > twill to log in the target site, read the html of a list of pages, > > > perform some searching procedures and keep adding the result to a > > > list. I integrated this in a default.py controller and make a call in > > > def index(): > > > This make the index.html page loading for a long time, because now it > > > have to finish to scan all pages before return all results. > > > What I want to achieve is to automatically refresh index every 2 > > > second to keep in touch with what is going on, seeing the list of > > > match growing in "realtime". Even better, if I can use some sort of > > > ajax magic to not refresh the entire page... but this is not vital, a > > > simple page refresh would be sufficient. > > > Question is: I have to use threading to solve this problem? > > > Alternative solutions? > > > I have to made the list of match a global to read it from another > > > function? It would be simpler if I made it write a text file, adding a > > > line for every match and reading it from the index controller? If I > > > have to use thread it will run on GAE? > > > > Sorry for the long text and for my bad english :) > > > > gls
[web2py] Re: A basic problem about threading and time consuming function...
It seems like Comet would be hard to implement in web2py. Does web2py use a threadpool internally? If so, I can see you running out of threads pretty quickly. Ideally you would like to solve these kinds of problems with an asynchronous model (think Gevent, Eventlet, Concurrence, Tornado). I am working on a project which requires a lot of slow processing (image resizing, sending emails) based on client-initiated calls. Massimo, have you considered an asynchronous model within web2py? Curious about your thoughts on it. I would much rather handle the long-running tasks in a green thread than block a complete thread. My first post here, and I just started to work with web2py on a social site. Great work, Massimo! Batteries included but still light. On May 25, 9:39 pm, Candid wrote: > Well, actually there is a way for the server to trigger an action in > the browser. It's called comet. Of course under the hood it's > implemented on top of http, so it's browser who initiates request, but > from the developer perspective it looks like there is dual channel > connection between the browser and the server, and they both can send > messages to each other asynchronously. There are several > implementation of comet technology. I've used Orbited (http:// > orbited.org/) and it worked quite well for me. > > On May 25, 9:00 pm, mdipierro wrote: > > > I would use a background process that does the work and adds the items > > to a database table. The index function would periodically refresh or > > pull an updated list via ajax from the database table. there is no way > > for te server to trigger an action in the browser unless 1) the > > browser initiates it or 2) the client code embeds an ajax http server. > > I would stay away from 1 and 2 and > > use reload of ajax. 
> > > On May 25, 5:33 pm, Giuseppe Luca Scrofani > > wrote: > > > > Hi all, as promised I'm here to prove you are patient and nice :) > > > I' have to make this little app where there is a function that read > > > the html content of several pages of another website (like a spider) > > > and if a specified keyword is found the app refresh a page where there > > > is the growing list of "match". > > > Now, the spider part is already coded, is called search(), it uses > > > twill to log in the target site, read the html of a list of pages, > > > perform some searching procedures and keep adding the result to a > > > list. I integrated this in a default.py controller and make a call in > > > def index(): > > > This make the index.html page loading for a long time, because now it > > > have to finish to scan all pages before return all results. > > > What I want to achieve is to automatically refresh index every 2 > > > second to keep in touch with what is going on, seeing the list of > > > match growing in "realtime". Even better, if I can use some sort of > > > ajax magic to not refresh the entire page... but this is not vital, a > > > simple page refresh would be sufficient. > > > Question is: I have to use threading to solve this problem? > > > Alternative solutions? > > > I have to made the list of match a global to read it from another > > > function? It would be simpler if I made it write a text file, adding a > > > line for every match and reading it from the index controller? If I > > > have to use thread it will run on GAE? > > > > Sorry for the long text and for my bad english :) > > > > gls
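Handing client-initiated slow work (image resizing, sending email) to a green thread, as discussed above, can be approximated with the stdlib instead of Gevent/Eventlet. This is only an analogy under assumptions: `resize_image` is a hypothetical stand-in for the real blocking job, and `asyncio.to_thread` (Python 3.9+) uses a worker thread where a green-thread library would use a monkey-patched coroutine.

```python
import asyncio
import time

def resize_image(name):
    """Stand-in for blocking work such as an image resize or SMTP send."""
    time.sleep(0.05)
    return name + '-thumbnail'

async def handle_upload(name):
    # Hand the slow job to a worker thread so the event loop (playing
    # the role of the green-thread scheduler) keeps serving requests.
    return await asyncio.to_thread(resize_image, name)
```

The request handler returns control to the loop at the `await`, which is the property the poster wants: one slow job no longer pins down a whole server thread for its full duration.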
[web2py] Re: A basic problem about threading and time consuming function...
Well, actually there is a way for the server to trigger an action in the browser. It's called comet. Of course under the hood it's implemented on top of http, so it's the browser that initiates the request, but from the developer's perspective it looks like there is a dual-channel connection between the browser and the server, and they both can send messages to each other asynchronously. There are several implementations of comet technology. I've used Orbited (http://orbited.org/) and it worked quite well for me. On May 25, 9:00 pm, mdipierro wrote: > I would use a background process that does the work and adds the items > to a database table. The index function would periodically refresh or > pull an updated list via ajax from the database table. there is no way > for te server to trigger an action in the browser unless 1) the > browser initiates it or 2) the client code embeds an ajax http server. > I would stay away from 1 and 2 and > use reload of ajax. > > On May 25, 5:33 pm, Giuseppe Luca Scrofani > wrote: > > > > > Hi all, as promised I'm here to prove you are patient and nice :) > > I' have to make this little app where there is a function that read > > the html content of several pages of another website (like a spider) > > and if a specified keyword is found the app refresh a page where there > > is the growing list of "match". > > Now, the spider part is already coded, is called search(), it uses > > twill to log in the target site, read the html of a list of pages, > > perform some searching procedures and keep adding the result to a > > list. I integrated this in a default.py controller and make a call in > > def index(): > > This make the index.html page loading for a long time, because now it > > have to finish to scan all pages before return all results. > > What I want to achieve is to automatically refresh index every 2 > > second to keep in touch with what is going on, seeing the list of > > match growing in "realtime". 
Even better, if I can use some sort of > > ajax magic to not refresh the entire page... but this is not vital, a > > simple page refresh would be sufficient. > > Question is: I have to use threading to solve this problem? > > Alternative solutions? > > I have to made the list of match a global to read it from another > > function? It would be simpler if I made it write a text file, adding a > > line for every match and reading it from the index controller? If I > > have to use thread it will run on GAE? > > > Sorry for the long text and for my bad english :) > > > gls
[web2py] Re: A basic problem about threading and time consuming function...
I would use a background process that does the work and adds the items to a database table. The index function would periodically refresh or pull an updated list via ajax from the database table. There is no way for the server to trigger an action in the browser unless 1) the browser initiates it or 2) the client code embeds an ajax http server. I would stay away from 1 and 2 and use reload or ajax. On May 25, 5:33 pm, Giuseppe Luca Scrofani wrote: > Hi all, as promised I'm here to prove you are patient and nice :) > I' have to make this little app where there is a function that read > the html content of several pages of another website (like a spider) > and if a specified keyword is found the app refresh a page where there > is the growing list of "match". > Now, the spider part is already coded, is called search(), it uses > twill to log in the target site, read the html of a list of pages, > perform some searching procedures and keep adding the result to a > list. I integrated this in a default.py controller and make a call in > def index(): > This make the index.html page loading for a long time, because now it > have to finish to scan all pages before return all results. > What I want to achieve is to automatically refresh index every 2 > second to keep in touch with what is going on, seeing the list of > match growing in "realtime". Even better, if I can use some sort of > ajax magic to not refresh the entire page... but this is not vital, a > simple page refresh would be sufficient. > Question is: I have to use threading to solve this problem? > Alternative solutions? > I have to made the list of match a global to read it from another > function? It would be simpler if I made it write a text file, adding a > line for every match and reading it from the index controller? If I > have to use thread it will run on GAE? > > Sorry for the long text and for my bad english :) > > gls
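The pattern described above (a background process appending matches to a table, with the index action polling it via ajax) reduces to something like this sqlite sketch. In web2py the DAL would replace the raw sqlite3 calls; the table and column names here are illustrative only.

```python
import sqlite3

# The background search() process appends matches here; the index
# action just renders whatever has accumulated so far.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE matches (id INTEGER PRIMARY KEY, keyword TEXT, url TEXT)')

def record_match(keyword, url):
    """Called by the background spider for every hit."""
    conn.execute('INSERT INTO matches (keyword, url) VALUES (?, ?)', (keyword, url))
    conn.commit()

def current_matches():
    """What the ajax-refreshed index view would render every 2 seconds."""
    return conn.execute('SELECT keyword, url FROM matches ORDER BY id').fetchall()
```

The database is the only shared state between the spider and the web request, which is why no globals, threads-in-the-request, or text files are needed: the reader simply sees the list grow between polls.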