Thanks, Guy, I'll try this debugging. However, before that I want to make
sure I'm using the correct mechanism for the timeout.
For example, within the thread that needs to wait 10 seconds, the sample
code is as follows:
struct timespec tmspec;
int retcode;

clock_gettime(CLOCK_REALTIME, &tmspec);
tmspec.tv_sec += 10; /* seconds to wait until timeout */
Then, within the loop: lock the mutex, then
retcode = pthread_cond_timedwait(&cond, &mutex, &tmspec);
Unlock the mutex, then check whether retcode == ETIMEDOUT and process on.
Does this seem to be correct, specifically the seconds I add to
tmspec.tv_sec?
Thanks, Rafi.

-----Original Message-----
From: guy keren [mailto:[EMAIL PROTECTED] 
Sent: Sunday, November 11, 2007 12:23 AM
To: Rafi Cohen
Cc: 'Gilad Ben-Yossef'; linux-il@cs.huji.ac.il
Subject: Re: concurrent timers on linux



i think you have a simple bug in your code that causes the behaviour 
you're talking about.

i would suggest that, as an exercise, you write a program with only 2 
threads, have one of them wait (with pthread_cond_timedwait) for 10 
seconds and then print a message with the thread ID and current time, 
and the second thread wait for 15 seconds and print a similar message. 
run this code in a loop, and see if you can get the threads to work as 
expected (i.e. one prints a message every 10 seconds, the other prints 
it every 15 seconds).

i guess that if you get this working properly, you'll be able to see why
your program is not working as expected.

--guy

Rafi Cohen wrote:
> Hi Gilad, first, thanks for your efforts to help.
> I'll try to give a brief explanation of what I'm trying to do. Well,
> I'll not use the word "timer", but "timeout". In any case, I'll be glad
> to hear from you if, still, timers were not the correct wording here.
> I'm working as a freelancer for a company involved in cellular
> communications and I was asked to write a kind of connections manager
> application.
> My application has to be able to manage concurrent (parallel)
> connections to some multicell systems, which may be roughly defined as
> cellular centrals.
> Those multicell systems exchange messages with my application for some
> tasks to be done by my application. In addition, there is a kind of
> software (let's call it client software) that knows how to communicate
> with those multicell systems through my application. So, basically,
> such client software may be used on various other computers at the
> same time, and should be connected to my application concurrently.
> And yet also, my application has to connect to an ftp server to upload
> some kind of files.
> As I said, all connections have to be concurrent and not blocking
> others.
> I chose for this the multithread approach, where each connection is a
> thread.
> Yeah, I know, many people would advise using non-blocking sockets
> here, but this is my weak point and I was in a hurry, so I decided to
> postpone this learning curve to a later occasion and chose multithreading.
> Why timeouts? For example, a timeout to block the ftp thread, after
> which there is a reconnection to the ftp server for upload.
> In case there is no connection to any of the multicell systems, a
> timeout to wait and then retry the connection.
> In the case of a connection with what I called client software, I have
> a very simple ping-pong protocol to make sure there is a connection
> between this software and my application, and again, after my
> application sends a "ping" it should wait for an adjustable time (by
> default 10 seconds) for the pong reply.
> So, as you see, there are various timeouts, and all of them should be
> able to proceed concurrently and unaffected by any other one.
> Unfortunately, what I see in my case is that the most recent timeout
> "takes the lead" on all other blocked threads at that time and affects
> their timeouts.
> So, for example, if the recent timeout was the one for the ftp and it
> is for a couple of hours, all the other threads that were blocked on
> their timeouts remain blocked for the whole duration of that last one.
> And here lies my problem. If you say that the system has a single
> timer, then that may well be the problem, and each thread needs to have
> its own independent timer (timer again?).
> So, I do think there is a problem with my strategy.
> If you say that pthread_cond_timedwait blocks the whole system, this
> is bad. I intended to block a single thread, and that's what I thought
> it should do.
> Concerning the clock_gettime function, there is an alternative,
> CLOCK_THREAD_CPUTIME_ID, as its first argument. I did not try it yet,
> but maybe this could lead to a better solution.
> Ah, very lengthy message, still clear I hope. I'll be glad to continue
> and receive assistance.
> Thanks Gilad, Rafi.
> 
> -----Original Message-----
> From: Gilad Ben-Yossef [mailto:[EMAIL PROTECTED]
> Sent: Saturday, November 10, 2007 10:23 AM
> To: Rafi Cohen
> Cc: linux-il@cs.huji.ac.il
> Subject: Re: concurrent timers on linux
> 
> 
> Rafi Cohen wrote:
> 
>> My application is a multithreaded one, in which each thread has its
>> own timer and tasks upon timeout. Each timer may be different (varies
>> from 10 seconds in one case to 24 hours in another).
>> Each timer should also be independent and the threads should run
>> concurrently without actually affecting other threads and timers.
> 
> POSIX only supplies a single timer counting in real time for each
> process.
>
> Multiple timers are either created by the programmer based on this
> single timer using a timer heap, or the same is done by the threading
> library pthread create_timer.
> 
>> It seems that there still is some effect of some timers on others, and
>> I don't achieve the goal of total independence of timers.
> 
> There is just one timer.
> 
>> Here is the strategy I decided to use, by the way, my code is in C:
>> Each thread uses in a loop the function pthread_cond_timedwait. Part 
>> of them may exit either upon signal or time out and part of them 
>> actually waits until timeout.
> 
> This is not a timer. It's a blocking system call with a timeout.
> 
>> The problem I think I see is the timer adjustment prior to using 
>> pthread_cond_timedwait.
> 
> What timer? you're using a blocking system call with a timeout.
> 
>> I should also note that each thread has its own unique mutex and
>> condition variable for this function. The time adjustment is done by
>> calling clock_gettime with CLOCK_REALTIME as its first parameter.
>> This is called before each use of pthread_cond_timedwait, and
>> immediately after the call to clock_gettime I add the timeout seconds
>> to the tv_sec field of the timespec structure.
>> Now, to my questions:
>> 1. Does this strategy seem correct, if not please give other ideas.
> 
> That's how it is supposed to be used
> 
>> 2. Specifically, is the function clock_gettime the correct one to
>> use, should its first parameter be the one I use, and should it
>> indeed be called before each pthread_cond_timedwait within the loop,
>> or only once before the loop?
> 
> Anyway, clock_gettime is the correct one. CLOCK_REALTIME is correct
> and you need to call it only once before entering the loop.
>
> The man page has an excellent code example:
> 
> http://linux.die.net/man/3/pthread_cond_timedwait
> 
> Search for "Timed Condition Wait"
> 
>> I'll be glad to have detailed ideas, in case you think I'm wrong 
>> here.
> 
> I don't think I understand what you're trying to achieve exactly.
> 
>> 3. Shachar (Shemes), you pointed me once to the libevent library as 
>> an alternative. I looked into this library and was very willing to 
>> use it. However, I understood from its documentation that it is not
>> thread-safe. Therefore, it seems not to be the right idea to use it
>> for concurrent timers.
> 
> Timers?
> "You keep saying that word. I don't think you know what it means..." 
> :-)
> 
>> Am I wrong here? I'll be glad to stand corrected and use such option 
>> instead of the strategy I mentioned above. Any assistance will be 
>> most appreciated.
> 
> A description of what you are trying to do will do wonders here, I 
> think.
> 
> And why threads, anyway?
> 
>> Note: I raise programming issues here from time to time and get good
>> answers most of the time. However, there are probably more Linux
>> users than programmers here. So, if you think there is a better
>> forum or mailing list for raising such questions, then please let me
>> know. I do think, however, that there are really knowledgeable people
>> here who can, and I believe will, help the best they can.
> 
> This happens to be the most technical Linux oriented mailing list in 
> Israel.
> 
> gilad
> 





