Re: codec - thread safe

2007-10-09 Thread sebb
On 09/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
 On 10/8/07, sebb [EMAIL PROTECTED] wrote:
  Which methods do you actually need?
 
  If you only need BCodec, then that (and Base64 which it calls) look to
  be thread-safe, so you only need to instantiate it once for each
  different charset.

 Yes, I sort of figured that out for myself already when I first
 started the conversation with Henri. Now thanks for your help, I got
 another pair of eyes confirming this. So I am good in my case.

 But the discussion has been more about the general case. I
 just feel that:

 1. For Codec, as a Commons library, it should not be this hard to find
 information like this. It should be either one or the other; "find
 out for yourself" is not a good situation. As someone else pointed out
 earlier, we could use better documentation.

Agreed. It should be stated clearly whether or not the code is thread-safe.

 2. It'd be nice for the biz method implementations to be thread safe
 (ideally in a high-performance manner, as a value-add of using a
 Commons-brand library, so that users don't have to get as creative as
 some of the suggestions in this discussion in order to achieve
 performance). Most of them may already be thread safe, and everyone
 seems to agree that it's not hard to make the rest that way.

For a library such as codec, I agree that it should be thread-safe by
design. Where this is not possible, the unsafe classes should be
clearly identified.

I don't think it makes sense to separate methods into biz and the rest.
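
The point quoted above, that a codec which looks thread-safe only needs to be instantiated once for each different charset, can be sketched roughly as follows. This is only an illustration under stated assumptions: CharsetCodec and CodecCache are hypothetical stand-ins, not Commons Codec classes, and the sketch uses java.util.Base64 and lambdas, so it assumes Java 8 or later rather than the JDKs discussed in this thread.

```java
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a codec such as BCodec: the charset is fixed at
// construction and never changes, so one instance can be shared by all threads.
final class CharsetCodec {
    private final Charset charset;

    CharsetCodec(Charset charset) {
        this.charset = charset;
    }

    String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(charset));
    }

    String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), charset);
    }
}

// Hands out exactly one shared codec per charset (hypothetical helper).
final class CodecCache {
    private static final ConcurrentMap<Charset, CharsetCodec> CACHE =
            new ConcurrentHashMap<>();

    private CodecCache() {
    }

    static CharsetCodec forCharset(Charset charset) {
        // computeIfAbsent creates the codec at most once per charset.
        return CACHE.computeIfAbsent(charset, CharsetCodec::new);
    }
}
```

A caller would ask for CodecCache.forCharset(StandardCharsets.UTF_8) wherever a codec is needed and get the same instance back each time.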

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: codec - thread safe

2007-10-09 Thread Will Pugh

sebb wrote:

On 09/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
  

On 10/8/07, sebb [EMAIL PROTECTED] wrote:


Which methods do you actually need?

If you only need BCodec, then that (and Base64 which it calls) look to
be thread-safe, so you only need to instantiate it once for each
different charset.
  

Yes, I sort of figured that out for myself already when I first
started the conversation with Henri. Now thanks for your help, I got
another pair of eyes confirming this. So I am good in my case.

But the discussion has been more about the general case. I
just feel that:

1. For Codec, as a Commons library, it should not be this hard to find
information like this. It should be either one or the other; "find
out for yourself" is not a good situation. As someone else pointed out
earlier, we could use better documentation.



Agreed. It should be stated clearly whether or not the code is thread-safe.
  

I attached a patch to this bug.

My solution here was to add static methods called
   createThreadSafeCodec

on all the classes that are thread safe.  For a few, this simply returns 
an instance of the object.  For others, it returns an instance of the 
object that overrides the dangerous set* methods.  That way, if anyone
tries to use it in an unsafe manner, they will deterministically get an
error.
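
A minimal sketch of the idea described above, a static createThreadSafeCodec method that returns an instance whose set* methods fail. This is not the attached patch, which is not shown in this thread: SimpleCodec and its encodeBlanks setting are hypothetical names chosen for illustration, and the exception type is an assumption.

```java
// Hypothetical codec with one mutable setting; not a real Commons Codec class.
class SimpleCodec {
    private boolean encodeBlanks;

    public void setEncodeBlanks(boolean encodeBlanks) {
        this.encodeBlanks = encodeBlanks;
    }

    public boolean isEncodeBlanks() {
        return encodeBlanks;
    }

    public String encode(String text) {
        return encodeBlanks ? text.replace(' ', '_') : text;
    }

    // Returns an instance whose setter always fails, so a shared instance
    // cannot be silently reconfigured while another thread is using it.
    public static SimpleCodec createThreadSafeCodec() {
        return new SimpleCodec() {
            @Override
            public void setEncodeBlanks(boolean encodeBlanks) {
                throw new UnsupportedOperationException(
                        "This shared codec instance cannot be reconfigured");
            }
        };
    }
}
```

Code that needs a different setting still creates its own SimpleCodec and configures it privately; only the shared instance rejects changes.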


My patch also adds a blurb in package.html basically saying this.
  

2. It'd be nice for the biz method implementations to be thread safe
(ideally in a high-performance manner, as a value-add of using a
Commons-brand library, so that users don't have to get as creative as
some of the suggestions in this discussion in order to achieve
performance). Most of them may already be thread safe, and everyone
seems to agree that it's not hard to make the rest that way.



For a library such as codec, I agree that it should be thread-safe by
design. Where this is not possible, the unsafe classes should be
clearly identified.

I don't think it makes sense to separate methods into biz and the rest.
  

I agree.




Re: codec - thread safe

2007-10-08 Thread simon
On Sun, 2007-10-07 at 15:23 -0500, Qingtian Wang wrote:
 Ok, I got the point.
 
 So let's say I wanted to work on this. What's the most effective way to do it?
 
 Search the entire code base line by line, trying to identify all the
 thread-unsafe points by myself? I guess that's very ineffective compared
 to having an issue opened and having the individual developers who wrote
 the code address it. They know what needs to be tweaked without even
 having to spend any time, since it's their own code. Or at least the dev
 team as a whole can come up with a list of points that need to be worked
 on. I think that'd be much more effective. Is there an established
 channel where that can happen?

I would suggest that you just check the classes/methods that you
yourself want to be threadsafe, and make any necessary fixes.

There's no obligation to fix anything else; if somebody needs
thread-safety improvements in a different part, then *they* can do it.

And then piece by piece the software gets better for everyone, using a
process that is both cooperative and fair.

Agreed, this doesn't work for major infrastructure type work, but that
is not the case here; just a method or a class needs to be checked for
safety and maybe a few internal synchronization commands added. Codec
really is pretty simple stuff; it's not necessary to have written the
code in order to understand it.





Re: codec - thread safe

2007-10-08 Thread sebb
On 08/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
 On 10/7/07, sebb [EMAIL PROTECTED] wrote:
  On 07/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
   Ok, I got the point.
  
   So let's say I wanted to work on this. What's the most effective way to 
   do it?
 
  You could try running some of the code checking utilities on the library.
 
  For example FindBugs and PMD.
 
  I've done a quick check with FindBugs, but that has not found any
  synchronisation issues. This is because there are no synchronised
  methods, so it cannot find any inconsistencies.
 
  But it should be possible to configure PMD to find any instance or
  static variables that are modified after construction.
 
  Or indeed you could perhaps try making all such variables final, and
  see whether it's possible to fix the compiler errors - but that might
  be a lot of work.
 
   Search the entire code base line by line, trying to identify all the
   thread-unsafe points by myself? I guess that's very ineffective compared
   to having an issue opened and having the individual developers who wrote
   the code address it. They know what needs to be tweaked without even
   having to spend any time, since it's their own code. Or at least the dev
   team as a whole can come up with a list of points that need to be worked
 
  Not necessarily - the code may have been developed over a long period,
  and the original authors may no longer be around. Unless the code has
  been worked on recently, it's possible that no-one has much detailed
  knowledge of the code.

 Wow, if that is really the case with Commons Codec, that no one has
 detailed knowledge of the code, it would suggest to me that no one
 should use it as an off-the-shelf product, since no one really
 knows what it would do.

I did not write specifically about Commons Codec - I was just pointing
out that any software that has been developed over a long period is
likely to have areas that are unfamiliar to some or all. Indeed, even
if you wrote the software yourself, would you still be familiar with
it several years later? Or even several months?

This is not just a software problem. Any large or long-lived project
will have areas that are not immediately familiar to anyone.

 I mean, it's one thing to say "Here's how we intend this product to
 behave, but use it at your own risk since there is no guarantee of any
 kind", but it's an entirely different thing to say "Here's a bunch of
 code, but we don't know how it behaves; you're welcome to find out for
 yourself and play around with it."

That's not the case here.
Does Commons Codec offer any thread-safe guarantees?
If not, then yes, one cannot say how it may behave in multi-threaded mode.
But that does not mean the behaviour is unpredictable in single-threaded usage.

For example, Java's SimpleDateFormat is not thread-safe.
Can one predict exactly what happens in multi-threaded mode?
Probably not.


 To me those two things put a product at two rather different levels of
 maturity. If Codec falls into the latter, a normal user who needs a
 codec library for their work might as well look somewhere else or
 roll their own code to handle their codec needs, since it would take the
 same or even more time to dig into commons-codec and find out what
 exactly it's trying to do.

Not so with Commons Codec, since it is quite a small library.

 -Q




 
   on. I think that'd be much more effective. Any established channel
   where that can happen?
 
  If you find some bugs, then report them via JIRA.
  It helps if you can provide test cases; obviously, patches as well are
  even better.
 
   Thanks,
   -Q
  
  
  
   On 10/7/07, Torsten Curdt [EMAIL PROTECTED] wrote:
 Can the dev team make that happen? - a humble request from a user.

 The thing about open source is that there is no distinction between
 developer and user.

 That's interesting. Why then do almost all open source projects have
 users doc vs. developer doc, users mailing list vs. developers
 mailing list?
   
Point taken. But this is more about the presentation of information.
Still the distinction is blurry as many developers and contributors
come initially from the user background. They felt the itch to
improve the code and ended up providing the fix/feature
themselves ...I guess we are trying to get you to do the same :) You
should see that more as an invitation to help.
   
 The developers develop because they want to use the code. And when
 somebody wants to use a feature that doesn't exist, then they can
 develop it.

 In short, if you want this feature, you can do this yourself and post a
 patch to this list so it gets included in the next release. There is no
 paid dev team for any of the commons code.

 OK i see. My intention was just to bring up an issue/request. There is
 no saying the developers are obligated to work on it. And contributing
 code is certainly 

Re: codec - thread safe

2007-10-08 Thread David J. Biesack
 Date: Sat, 6 Oct 2007 23:31:19 -0500
 From: Qingtian Wang [EMAIL PROTECTED]
 
 Well, it's pick-your-poison kind of a deal. Either block on one
 instance and take a performance hit, or burn up the memory with lots
 of instances.

Why not compromise? Create a ThreadPoolExecutor of some reasonable size and
submit a FutureTask to run the encode or decode, and then do a future.get()
when the value is needed. You might even get some concurrency if you can
start the encode, then continue doing other work until you need the result.
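
A rough sketch of the pool approach suggested above, under some assumptions: PooledDecoder is a hypothetical helper, the decode step uses java.util.Base64 as a stand-in for whichever codec is actually needed, and Executors.newFixedThreadPool is used as a convenient way to obtain a bounded pool. It assumes Java 8 or later.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper that runs decode work on a bounded thread pool.
final class PooledDecoder {
    private final ExecutorService pool;

    PooledDecoder(int threads) {
        this.pool = Executors.newFixedThreadPool(threads);
    }

    // Starts the decode on a pool thread and returns immediately.
    Future<String> submit(final String encoded) {
        return pool.submit(() -> {
            byte[] raw = Base64.getDecoder().decode(encoded);
            return new String(raw, StandardCharsets.UTF_8);
        });
    }

    // Blocks until the result is ready; wraps checked exceptions for brevity.
    static String await(Future<String> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    // Lets the pool threads exit once queued work has finished.
    void shutdown() {
        pool.shutdown();
    }
}
```

The caller can do other work between submit and await, which is where any concurrency gain would come from.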

-- 
David J. Biesack SAS Institute Inc.
(919) 531-7771   SAS Campus Drive
http://www.sas.com   Cary, NC 27513





RE: codec - thread safe

2007-10-08 Thread Jörg Schaible
David J. Biesack wrote on Monday, October 08, 2007 3:02 PM:

 Date: Sat, 6 Oct 2007 23:31:19 -0500
 From: Qingtian Wang [EMAIL PROTECTED]
 
 Well, it's pick-your-poison kind of a deal. Either block on one
 instance and take a performance hit, or burn up the memory with lots
 of instances.
 
 Why not compromise? Create a ThreadPoolExecutor of some
 reasonable size and submit a FutureTask
 to run the encode or decode, and then do a future.get() when
 the value is needed. You might
 even get some concurrency if you can start the encode, then continue
 doing other work until you need the result.

Because it runs on JDK 1.3? However, that's the reason why I argued not to
promise thread-safety for any codec and to provide synchronization wrappers
instead; alternatively, a user might take the pool approach ... which is
quite easy with JDK 5, as you've shown here :)
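
For illustration, a synchronization wrapper of the kind mentioned above might look like the sketch below, in the spirit of Collections.synchronizedList. All names here (TextCodec, BufferingCodec, Codecs.synchronizedCodec) are hypothetical; the real Commons Codec interfaces differ.

```java
// Hypothetical minimal codec contract; the real Commons Codec interfaces differ.
interface TextCodec {
    String encode(String text);
}

// A deliberately stateful codec: it reuses one buffer, so concurrent calls
// on the same instance could corrupt each other's output.
final class BufferingCodec implements TextCodec {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public String encode(String text) {
        buffer.setLength(0);
        for (int i = 0; i < text.length(); i++) {
            buffer.append(Integer.toHexString(text.charAt(i)));
        }
        return buffer.toString();
    }
}

final class Codecs {
    private Codecs() {
    }

    // Wraps a codec so that all calls go through one lock.
    static TextCodec synchronizedCodec(final TextCodec delegate) {
        return new TextCodec() {
            @Override
            public synchronized String encode(String text) {
                return delegate.encode(text);
            }
        };
    }
}
```

The wrapper only helps if every thread goes through it; code that keeps a reference to the unwrapped delegate bypasses the lock.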

- Jörg




Re: codec - thread safe

2007-10-08 Thread David J. Biesack
 Date: Mon, 8 Oct 2007 16:23:59 +0200
 From: Jörg Schaible [EMAIL PROTECTED]
 
 David J. Biesack wrote on Monday, October 08, 2007 3:02 PM:
 
  Date: Sat, 6 Oct 2007 23:31:19 -0500
  From: Qingtian Wang [EMAIL PROTECTED]
  
  Well, it's pick-your-poison kind of a deal. Either block on one
  instance and take a performance hit, or burn up the memory with lots
  of instances.
  
  Why not compromise? Create a ThreadPoolExecutor ...
 
 Because it runs on JDK 1.3?

The java.util.concurrent backport http://backport-jsr166.sourceforge.net/ runs 
on 1.3, for just this kind of use.

 However, that's the reason why I argued not to promise thread-safety for
 any codec and to provide synchronization wrappers instead; alternatively,
 a user might take the pool approach ... which is quite easy with JDK 5,
 as you've shown here :)

I think it makes a lot of sense to document the thread safety attributes of 
Commons libraries.

 - Jörg

-- 
David J. Biesack SAS Institute Inc.
(919) 531-7771   SAS Campus Drive
http://www.sas.com   Cary, NC 27513





Re: codec - thread safe

2007-10-08 Thread Qingtian Wang
On 10/8/07, sebb [EMAIL PROTECTED] wrote:
 On 08/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
  On 10/7/07, sebb [EMAIL PROTECTED] wrote:
   On 07/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
Ok, I got the point.
   
So let's say I wanted to work on this. What's the most effective way to 
do it?
  
   You could try running some of the code checking utilities on the library.
  
   For example FindBugs and PMD.
  
   I've done a quick check with FindBugs, but that has not found any
   synchronisation issues. This is because there are no synchronised
   methods, so it cannot find any inconsistencies.
  
   But it should be possible to configure PMD to find any instance or
   static variables that are modified after construction.
  
   Or indeed you could perhaps try making all such variables final, and
   see whether it's possible to fix the compiler errors - but that might
   be a lot of work.
  
Search the entire code base line by line, trying to identify all the
thread-unsafe points by myself? I guess that's very ineffective compared
to having an issue opened and having the individual developers who wrote
the code address it. They know what needs to be tweaked without even
having to spend any time, since it's their own code. Or at least the dev
team as a whole can come up with a list of points that need to be worked
  
   Not necessarily - the code may have been developed over a long period,
   and the original authors may no longer be around. Unless the code has
   been worked on recently, it's possible that no-one has much detailed
   knowledge of the code.
 
  Wow, if that is really the case with Commons Codec, that no one has
  detailed knowledge of the code, it would suggest to me that no one
  should use it as an off-the-shelf product, since no one really
  knows what it would do.

 I did not write specifically about Commons Codec - I was just pointing
 out that any software that has been developed over a long period is
 likely to have areas that are unfamiliar to some or all. Indeed, even
 if you wrote the software yourself, would you still be familiar with
 it several years later? Or even several months?

No, admittedly I wouldn't. But I do believe that, if I am still
around, it's easier for me to get back into the code I wrote and ask
myself "what the heck was I thinking?" than for someone else to try to
do the same.


 This is not just a software problem. Any large or long-lived project
 will have areas that are not immediately familiar to anyone.

Agreed. But at the same time, trying to fully utilize the available
knowledge base to start with might be beneficial. That's all I am
trying to say.



  I mean, it's one thing to say "Here's how we intend this product to
  behave, but use it at your own risk since there is no guarantee of any
  kind", but it's an entirely different thing to say "Here's a bunch of
  code, but we don't know how it behaves; you're welcome to find out for
  yourself and play around with it."

 That's not the case here.
 Does Commons Codec offer any thread-safe guarantees?
 If not, then yes, one cannot say how it may behave in multi-threaded mode.
 But that does not mean the behaviour is unpredictable in single-threaded 
 usage.

That's exactly what I was trying to figure out: Is thread safety an
intention in general for the code base?

And I found out that the answer is no and, worse, it seems that
some methods are thread-safe and others are not, and nobody knows
which is which. That's why I made an issue out of it: since nobody
knows for sure which are and which are not, can we at least try to
establish for sure where the existing code base stands in terms of
thread safety?

And I'd say that until we can be sure of that, we should tell the users
up front: "No thread safety is considered." To me, that's much better
than saying nothing.



 For example, Java's SimpleDateFormat is not thread-safe.
 Can one predict exactly what happens in multi-threaded mode?
 Probably not.

That is actually a very good example: in the Javadoc of that
class, Sun says loud and clear that it is not thread-safe. My wish is
simply that Codec give that kind of statement, and then
make the main business methods such as encode/decode thread safe,
rather than saying nothing and leaving it for the users to find out.



  To me those two things put a product at two rather different levels of
  maturity. If Codec falls into the latter, a normal user who needs a
  codec library for their work might as well look somewhere else or
  roll their own code to handle their codec needs, since it would take the
  same or even more time to dig into commons-codec and find out what
  exactly it's trying to do.

 Not so with Commons Codec, since it is quite a small library.

Well big or small is rather subjective: For a slow developer like
myself in a rush to find a codec library for my work, it's big enough.
:)



  -Q
 
 
 
 
  
on. I think that'd be much more effective. Any established channel

RE: codec - thread safe

2007-10-08 Thread Jörg Schaible
David J. Biesack wrote on Monday, October 08, 2007 4:40 PM:

 Date: Mon, 8 Oct 2007 16:23:59 +0200
 From: Jörg Schaible [EMAIL PROTECTED]
 
 David J. Biesack wrote on Monday, October 08, 2007 3:02 PM:
 
 Date: Sat, 6 Oct 2007 23:31:19 -0500
 From: Qingtian Wang [EMAIL PROTECTED]
 
 Well, it's pick-your-poison kind of a deal. Either block on one
 instance and take a performance hit, or burn up the memory with
 lots of instances.
 
 Why not compromise? Create a ThreadPoolExecutor ...
 
 Because it runs on JDK 1.3?
 
 The java.util.concurrent backport
 http://backport-jsr166.sourceforge.net/ runs on 1.3, for just this
 kind of use. 

I know this, but I doubt that we want to make Apache Commons components
depend on it ;-)

- Jörg




Re: codec - thread safe

2007-10-08 Thread David J. Biesack
 Date: Mon, 8 Oct 2007 17:01:02 +0200
 From: Jörg Schaible [EMAIL PROTECTED]
 
  The java.util.concurrent backport
  http://backport-jsr166.sourceforge.net/ runs on 1.3, for just this
  kind of use.
 
 I know this, but I doubt that we want to make Apache Commons components
 depend on it ;-)

Agreed; I, like you, was merely suggesting what Commons consumers might do
to deal with non-threadsafe codecs.

 - Jörg

-- 
David J. Biesack SAS Institute Inc.
(919) 531-7771   SAS Campus Drive
http://www.sas.com   Cary, NC 27513





Re: codec - thread safe

2007-10-08 Thread Will Pugh

David J. Biesack wrote:

Date: Mon, 8 Oct 2007 17:01:02 +0200
From: Jörg Schaible [EMAIL PROTECTED]



The java.util.concurrent backport
http://backport-jsr166.sourceforge.net/ runs on 1.3, for just this
kind of use. 
  

I know this, but I doubt that we want to make Apache Commons components
depend on it ;-)

Agreed; I, like you, was merely suggesting what Commons consumers might do
to deal with non-threadsafe codecs.

  
Might seem like a silly question, but has anyone found anything that 
looks like a threading issue in Codec? 

The code is pretty simple.  I did a quick look through the code, and it 
seemed like everything was pretty thread safe.  It looks like you should 
be O.K. as long as:
   1)  You don't change the parameters of a codec while it's running,
e.g. changing charsets, etc.
   2)  You don't share a MessageDigest returned by DigestUtils between
multiple threads.  This should be clear given the API (it's a Sun API).


It also seems to me that the difference between code that you can trust
as being thread safe and code you cannot trust as being thread safe has
a lot to do with whether it is tested for thread safety.  Seems to me
like another way of helping out here would be to build concurrency tests
for whichever APIs you are interested in using on multiple threads.  Not
sure where we would put them yet, but it seems like the only way to assure
thread safety (especially if we could get someone to volunteer to run
them on a big beefy multi-proc machine :).
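
A concurrency test of the kind suggested above could be sketched as follows. ConcurrencyCheck is a hypothetical helper, and java.util.Base64 (Java 8+) stands in for whichever codec API is of interest. A test like this can only expose a problem when it fails; a clean run does not prove thread safety.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stress check: many threads share one encoder and one decoder.
final class ConcurrencyCheck {
    private ConcurrencyCheck() {
    }

    // Returns how many round trips produced a wrong result.
    static int countMismatches(int threads, final int iterationsPerThread) {
        final Base64.Encoder encoder = Base64.getEncoder(); // shared on purpose
        final Base64.Decoder decoder = Base64.getDecoder(); // shared on purpose
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final String input = "payload-" + t; // distinct data per thread
                Callable<Integer> task = () -> {
                    int bad = 0;
                    for (int i = 0; i < iterationsPerThread; i++) {
                        byte[] raw = input.getBytes(StandardCharsets.UTF_8);
                        byte[] decoded = decoder.decode(encoder.encode(raw));
                        if (!input.equals(new String(decoded, StandardCharsets.UTF_8))) {
                            bad++;
                        }
                    }
                    return bad;
                };
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Giving each thread its own input matters: if every thread encoded the same string, state leaking between threads could go unnoticed.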

- Jörg



  





Re: codec - thread safe

2007-10-08 Thread sebb
On 08/10/2007, Will Pugh [EMAIL PROTECTED] wrote:
 David J. Biesack wrote:
  Date: Mon, 8 Oct 2007 17:01:02 +0200
  From: Jörg Schaible [EMAIL PROTECTED]
 
  The java.util.concurrent backport
  http://backport-jsr166.sourceforge.net/ runs on 1.3, for just this
  kind of use.
 
  I know this, but I doubt that we want to make Apache Commons components
  depend on it ;-)
 
  Agreed; I, like you, was merely suggesting what Commons consumers might
  do to deal with non-threadsafe codecs.
 
 
 Might seem like a silly question, but has anyone found anything that
 looks like a threading issue in Codec?

Yes - QCodec has a public method that sets an instance variable (see
my comments on CODEC-55).

 The code is pretty simple.  I did a quick look through the code, and it
 seemed like everything was pretty thread safe.  It looks like you should
 be O.K. as long as:
1)  You don't change the parameters to a codec while it's running,
 e.g. changing charsets, etc.

See above.

Having said that, I think it would not be too difficult to fix the
classes so that they are thread-safe. Most of the ones I looked at
could even be made invariant.

So their thread-safety would only depend on any external calls they
made; this could presumably be fixed by providing synchronised
versions as suggested earlier.
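
As a sketch of what "invariant" could mean in practice, assuming it is read as immutable: every setting becomes a final field fixed in the constructor, and a changed setting yields a new instance. ImmutableCodec is hypothetical and is not how QCodec is actually written.

```java
import java.nio.charset.Charset;

// Hypothetical codec, not the real QCodec: every setting is a final field
// fixed in the constructor, so there is no setter for another thread to call.
final class ImmutableCodec {
    private final Charset charset;
    private final boolean encodeBlanks;

    ImmutableCodec(Charset charset, boolean encodeBlanks) {
        this.charset = charset;
        this.encodeBlanks = encodeBlanks;
    }

    // "Changing" a setting returns a new instance and leaves this one alone.
    ImmutableCodec withEncodeBlanks(boolean value) {
        return new ImmutableCodec(charset, value);
    }

    Charset getCharset() {
        return charset;
    }

    String encode(String text) {
        return encodeBlanks ? text.replace(' ', '_') : text;
    }
}
```

With no mutable state of its own, such a class is only as thread-safe as the external calls it makes.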

 2)  You should not share a MessageDigest returned by DigestUtils between
 multiple threads.  This should be clear given the API (it's a Sun API)

 It also seems to me that the difference between code that you can trust
 as being thread safe and code you cannot trust as being thread safe has
 a lot to do with whether it is tested for thread safety.  Seems to me
 like another way of helping out here would be to build concurrency tests
 for whichever APIs you are interested in using on multiple threads.  Not
 sure where we would put them yet, but it seems like the only way to assure
 thread safety (especially if we could get someone to volunteer to run
 them on a big beefy multi-proc machine :).

Indeed.

  - Jörg
 
 
 




Re: codec - thread safe

2007-10-08 Thread Will Pugh
Ya know.  Just to put this in a little perspective. 

These objects are pretty small (QCodec has only two members in it), and
have practically no instantiation cost.  Instantiating these and quickly
throwing them away is probably not such a bad idea, and may give you
better performance than doing anything with ThreadLocal.  Java VMs are
fairly well tuned for dealing with short-lived, small objects (which
these are).


I can't imagine that many people are using thousands of these per 
request, probably, more on the order of one or two a request?


   --Will

James Carman wrote:

Try using a ThreadLocal variable to store your Encoder/Decoder that
you need to be thread safe.
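
The ThreadLocal suggestion quoted here can be sketched as below. ScratchDecoder is a hypothetical stateful decoder invented for the example (its "decoding" is just a placeholder), and ThreadLocal.withInitial assumes Java 8 or later.

```java
// Hypothetical stateful decoder: it reuses an internal buffer, so a single
// instance must not be used by two threads at once.
final class ScratchDecoder {
    private final StringBuilder scratch = new StringBuilder();

    String decode(String text) {
        scratch.setLength(0);
        scratch.append(text).reverse(); // placeholder for real decoding work
        return scratch.toString();
    }
}

final class PerThreadDecoder {
    // Each thread lazily gets, and then keeps, its own decoder.
    private static final ThreadLocal<ScratchDecoder> LOCAL =
            ThreadLocal.withInitial(ScratchDecoder::new);

    private PerThreadDecoder() {
    }

    static String decode(String text) {
        return LOCAL.get().decode(text);
    }

    // Exposed only so callers can see that the instance is reused per thread.
    static ScratchDecoder current() {
        return LOCAL.get();
    }
}
```

As noted in the reply above, for small objects with negligible construction cost, simply creating a new instance per call may do as well or better.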



Re: codec - thread safe

2007-10-08 Thread James Carman
That's kind of what I said on the JIRA issue, too.

On 10/8/07, Will Pugh [EMAIL PROTECTED] wrote:
 Ya know.  Just to put this in a little perspective.

 These objects are pretty small (QCodec has only two members in it), and
 have practically no instantiation cost.  Instantiating these and quickly
 throwing them away is probably not such a bad idea, and may give you
 better performance than doing anything with ThreadLocal.  Java VMs are
 fairly well tuned for dealing with short-lived, small objects (which
 these are).

 I can't imagine that many people are using thousands of these per
 request, probably, more on the order of one or two a request?

 --Will

 James Carman wrote:
  Try using a ThreadLocal variable to store your Encoder/Decoder that
  you need to be thread safe.
 


Re: codec - thread safe

2007-10-08 Thread Qingtian Wang
The SLA of the project I am working on is 2000 transactions per
second. And I need to decode a 1K string on each request.

-Q



On 10/8/07, Will Pugh [EMAIL PROTECTED] wrote:
 Ya know.  Just to put this in a little perspective.

 These objects are pretty small (QCodec has only two members in it), and
 have practically no instantiation cost.  Instantiating these and quickly
 throwing them away is probably not such a bad idea, and may give you
 better performance than doing anything with ThreadLocal.  Java VMs are
 fairly well tuned for dealing with short-lived, small objects (which
 these are).

 I can't imagine that many people are using thousands of these per
 request, probably, more on the order of one or two a request?

 --Will

 James Carman wrote:
  Try using a ThreadLocal variable to store your Encoder/Decoder that
  you need to be thread safe.
 
  On 10/8/07, sebb [EMAIL PROTECTED] wrote:
 
  On 08/10/2007, Will Pugh [EMAIL PROTECTED] wrote:
 
  David J. Biesack wrote:
 
  Date: Mon, 8 Oct 2007 17:01:02 +0200
  From: =?iso-8859-1?Q?J=F6rg_Schaible?= [EMAIL PROTECTED]
 
 
 
  The java.util.concurrent backport
  http://backport-jsr166.sourceforge.net/ runs on 1.3, for just this
  kind of use.
 
 
  I know this, but I doubt that we want to make Apache Commons
  components depend on it ;-)
 
 
  Agreed; I, like you, was merely suggesting what Commons consumers
  might do to deal with non-threadsafe codecs.
 
 
 
  Might seem like a silly question, but has anyone found anything that
  looks like a threading issue in Codec?
 
  Yes -  QCodec has a public method that sets an instance variable (see
  my comments on CODEC-55).
 
 
  The code is pretty simple.  I did a quick look through the code, and it
  seemed like everything was pretty thread safe.  It looks like you should
  be O.K. as long as:
 1)  You don't change the parameters to a codec while it's running,
  e.g. changing charsets, etc.
 
  See above.
 
  Having said that, I think it would not be too difficult to fix the
  classes so that they are thread-safe. Most of the ones I looked at
  could even be made invariant.
 
  So their thread-safety would only depend on any external calls they
  made; this could presumably be fixed by providing synchronised
  versions as suggested earlier.
 
 
 2)  You should not share a MessageDigest returned by DigestUtils between
  multiple threads.  This should be clear given the API (it's a Sun API)
 
  It also seems to me that the difference between code that you can trust
  as being thread safe, and code you cannot trust as being thread safe has
  a lot to do with whether it is tested for thread safety.  Seems to me
  like another way of helping out here would be to build concurrency tests
  for whichever APIs you are interested in using on multiple threads.  Not
  sure where we would put them yet, but seems like the only way to assure
  thread-safety (especially if we could get someone to volunteer running
  them on a big beefy multi-proc machine :).
 
  Indeed.
 
 
  - Jörg
 
 
 



Re: codec - thread safe

2007-10-08 Thread sebb
Which methods do you actually need?

If you only need BCodec, then that (and Base64 which it calls) look to
be thread-safe, so you only need to instantiate it once for each
different charset.




Re: codec - thread safe

2007-10-08 Thread James Carman
And a simple map would suffice in that case (lazily initialized, of
course).  If you need QCodec, then you'd have to incorporate
encodeBlanks and the charset into the map's key (if you really have
cases where you do/do not want to encode blanks).
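
A minimal sketch of the lazily initialized map described above. It assumes a modern JDK (Java 8 or later, for ConcurrentHashMap.computeIfAbsent), which postdates this thread. SketchCodec and CodecCache are hypothetical names used only for illustration; SketchCodec is a stand-in whose settings are fixed at construction, not the real QCodec.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a codec configured once, at construction time.
final class SketchCodec {
    final String charset;
    final boolean encodeBlanks;

    SketchCodec(String charset, boolean encodeBlanks) {
        this.charset = charset;
        this.encodeBlanks = encodeBlanks;
    }
}

// Hypothetical cache: one shared instance per (charset, encodeBlanks) pair.
final class CodecCache {
    private static final Map<String, SketchCodec> CACHE = new ConcurrentHashMap<>();

    static SketchCodec get(String charset, boolean encodeBlanks) {
        // Both settings go into the key, as suggested in the message above.
        String key = charset + '|' + encodeBlanks;
        // computeIfAbsent creates the instance lazily, on first request for a key.
        return CACHE.computeIfAbsent(key, k -> new SketchCodec(charset, encodeBlanks));
    }
}
```

Sharing instances this way is only safe if the cached object is never reconfigured after it has been handed out.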


Re: codec - thread safe

2007-10-08 Thread sebb
Or take a copy of QCodec and fix it to remove the offending code...

The change would be more difficult to make in the codec library,
because removing the offending method would not be backwards
compatible.
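
As a rough illustration of what such a fixed copy might look like, the sketch below replaces a setter with final fields plus a method that returns a new instance. ImmutableSketchCodec and its methods are hypothetical names, and the encoding step is a toy substitution, not the Commons Codec implementation.

```java
// Hypothetical immutable codec sketch; not the Commons Codec QCodec source.
final class ImmutableSketchCodec {
    private final String charset;
    private final boolean encodeBlanks;

    ImmutableSketchCodec(String charset, boolean encodeBlanks) {
        this.charset = charset;
        this.encodeBlanks = encodeBlanks;
    }

    String charset() {
        return charset;
    }

    // Takes the place of a setter: returns a new instance and leaves this one untouched.
    ImmutableSketchCodec withEncodeBlanks(boolean value) {
        return new ImmutableSketchCodec(charset, value);
    }

    // Toy transformation standing in for the real encoding step.
    String encodeText(String text) {
        return encodeBlanks ? text.replace(' ', '_') : text;
    }
}
```

Because no field changes after construction, one instance can be shared between threads; a caller that wants a different setting gets a separate object.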

   
   
  
   

Re: codec - thread safe

2007-10-08 Thread Qingtian Wang
On 10/8/07, sebb [EMAIL PROTECTED] wrote:
 Which methods do you actually need?

 If you only need BCodec, then that (and Base64 which it calls) look to
 be thread-safe, so you only need to instantiate it once for each
 different charset.

Yes, I sort of figured that out for myself already when I first
started the conversation with Henri. Now thanks for your help, I got
another pair of eyes confirming this. So I am good in my case.

But the discussion has been more about the general case. I
just feel that:

1. For a commons library like Codec, it should not be this hard to find
out whether a class is thread-safe. The documentation should say one
way or the other; "find out for yourself" is not a good situation. As
someone else pointed out earlier, we could use better documentation.

2. It'd be nice for the biz method implementations to be thread safe
(ideally in a high-performance manner, as a value-add of using a
commons-brand library, so that users don't have to get as creative as
some of the suggestions in this discussion to achieve performance).
Most of them may already be thread safe, and everyone seems to agree
that it's not hard to make the rest thread safe too.

Thanks to all for your help!
-Q




Re: codec - thread safe

2007-10-07 Thread sebb
On 07/10/2007, simon [EMAIL PROTECTED] wrote:
 On Sat, 2007-10-06 at 23:31 -0500, Qingtian Wang wrote:
  Well, it's pick-your-poison kind of a deal. Either block on one
  instance and take a performance hit, or burn up the memory with lots
  of instances.
 
  But in the case of BCodec, I think encode/decode is thread safe.
  Unfortunately per Henri, that's not generally true for others.
 
  Well, let me make it clear that I am a total layman on codec. But it
  seems to me it's not that difficult to implement all the codec methods
  in a thread safe manner, without sync blocks.
 
  Can the dev team make that happen? - a humble request from a user.

 The thing about open source is that there is no distinction between
 developer and user. The developers develop because they want to use the
 code. And when somebody wants to use a feature that doesn't exist, then
 they can develop it.

 In short, if you want this feature, you can do this yourself and post a
 patch to this list so it gets included in the next release. There is no
 paid dev team for any of the commons code.

Might be better to create a JIRA issue and attach the patch to that:
* others can easily see that the issue has been reported
* patches can get mangled when posted to lists, and are more difficult
to keep track of.

For Codec, start here:
http://commons.apache.org/codec/issue-tracking.html




Re: codec - thread safe

2007-10-07 Thread Qingtian Wang
On 10/7/07, simon [EMAIL PROTECTED] wrote:
 On Sat, 2007-10-06 at 23:31 -0500, Qingtian Wang wrote:
  Well, it's pick-your-poison kind of a deal. Either block on one
  instance and take a performance hit, or burn up the memory with lots
  of instances.
 
  But in the case of BCodec, I think encode/decode is thread safe.
  Unfortunately per Henri, that's not generally true for others.
 
  Well, let me make it clear that I am a total layman on codec. But it
  seems to me it's not that difficult to implement all the codec methods
  in a thread safe manner, without sync blocks.
 
  Can the dev team make that happen? - a humble request from a user.

 The thing about open source is that there is no distinction between
 developer and user.

That's interesting. Why then do almost all open source projects have
users doc vs. developer doc, users mailing list vs. developers
mailing list?

 The developers develop because they want to use the
 code. And when somebody wants to use a feature that doesn't exist, then
 they can develop it.

 In short, if you want this feature, you can do this yourself and post a
 patch to this list so it gets included in the next release. There is no
 paid dev team for any of the commons code.

OK, I see. My intention was just to bring up an issue/request. Nobody is
saying the developers are obligated to work on it. And contributing
code is certainly not the only way of getting involved with an open
source project. Nevertheless, if the dev team (unpaid as I understand it)
wants the project to be more popular, it might help to listen to what
users other than themselves have to say; otherwise just forget
about it.

-Q







 Regards,

 Simon





Re: codec - thread safe

2007-10-07 Thread Torsten Curdt

Can the dev team make that happen? - a humble request from a user.


The thing about open source is that there is no distinction between
developer and user.


That's interesting. Why then almost all open source projects have
users doc vs. developer doc, users mailing list vs. developers
mailing list?


Point taken. But this is more about the presentation of information.  
Still the distinction is blurry as many developers and contributors  
come initially from the user background. They felt the itch to  
improve the code and ended up providing the fix/feature  
themselves ...I guess we are trying to get you to do the same :) You  
should see that more as an invitation to help.



The developers develop because they want to use the
code. And when somebody wants to use a feature that doesn't exist, then
they can develop it.

In short, if you want this feature, you can do this yourself and post a
patch to this list so it gets included in the next release. There is no
paid dev team for any of the commons code.


OK i see. My intention was just to bring up an issue/request. There is
no saying the developers are obligated to work on it. And contributing
code is certainly not the only way of involving with an open source
project.


That is of course true. But I guess what Simon wanted to point
out is that requesting a feature rarely means it gets done unless a
developer needs that feature too. The personal TODO lists are usually
very long. So usually you are better off contributing a patch
yourself ...or paying someone to do it for you. That's how it works.



Nevertheless, if the dev team (unpaid as I understand it)
wants the project to be more popular, it might help listening to what
other users have to say other than themselves; otherwise just forget
about it.


Well, sure ...listening is great - but someone gotta do it.
There is no us vs you, no dev vs user there is just us.

Does that make sense?

cheers
--
Torsten




Re: codec - thread safe

2007-10-07 Thread Qingtian Wang
Ok, I got the point.

So let's say I wanted to work on this. What's the most effective way to do it?

Search the entire code base line by line, trying to identify all the
thread-unsafe points by myself? I guess that's very ineffective compared
to having an issue opened and having the individual developers who wrote
the code address it - they know what needs to be tweaked without having
to spend much time, since it's their own code. Or at least the dev
team as a whole could come up with a list of points that need to be
worked on. I think that'd be much more effective. Is there any
established channel where that can happen?

Thanks,
-Q






Re: codec - thread safe

2007-10-07 Thread sebb
On 07/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
 Ok, I got the point.

 So let's say I wanted to work on this. What's the most effective way to do it?

You could try running some of the code checking utilities on the library.

For example Findbugs and PMD.

I've done a quick check with FindBugs, but that has not found any
synchronisation issues. This is because there are no synchronised
methods, so it cannot find any inconsistencies.

But it should be possible to configure PMD to find any instance or
static variables that are modified after construction.

Or indeed you could perhaps try making all such variables final, and
see whether it's possible to fix the compiler errors - but that might
be a lot of work.

 Search the entire code base line by line trying to ID all the thread
 unsafe points by myself? I guess that's very ineffective compared to
 have an issue open, and have the individual developers who write the
 code to address it - They know what needs to be tweaked without having
 to even spend any time since it's their own code. Or at least, the dev
 team as a whole can come up a list of points that need to be worked

Not necessarily - the code may have been developed over a long period,
and the original authors may no longer be around. Unless the code has
been worked on recently, it's possible that no-one has much detailed
knowledge of the code.

 on. I think that'd be much more effective. Any established channel
 where that can happen?

If you find some bugs, then report them via JIRA.
It helps if you can provide test cases; obviously patches as well is
even better.
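
One possible shape for such a test case is sketched below. ConcurrencyProbe and countMismatches are hypothetical names; the operation under test is passed in as a function, so nothing here depends on the Codec API. The sketch assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical helper: hammers one shared operation from several threads
// and counts results that differ from the single-threaded answer.
final class ConcurrencyProbe {
    static int countMismatches(UnaryOperator<String> op, String input,
                               int threads, int iterations) {
        String expected = op.apply(input);           // single-threaded reference result
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Integer> task = () -> {
                    start.await();                   // release all workers together
                    int bad = 0;
                    for (int i = 0; i < iterations; i++) {
                        if (!expected.equals(op.apply(input))) {
                            bad++;
                        }
                    }
                    return bad;
                };
                results.add(pool.submit(task));
            }
            start.countDown();
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("probe failed", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A non-zero count demonstrates a problem, but a zero count does not prove thread safety: races are timing-dependent and may only show up on a multi-processor machine or under heavier load.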

 Thanks,
 -Q



 On 10/7/07, Torsten Curdt [EMAIL PROTECTED] wrote:
   Can the dev team make that happen? - a humble request from a user.
  
   The think about open source is that there is no distinction between
   developer and user.
  
   That's interesting. Why then almost all open source projects have
   users doc vs. developer doc, users mailing list vs. developers
   mailing list?
 
  Point taken. But this is more about the presentation of information.
  Still the distinction is blurry as many developers and contributors
  come initially from the user background. They felt the itch to
  improve the code and ended up providing the fix/feature
  themselves ...I guess we are trying to get you to do the same :) You
  should see that more as an invitation to help.
 
   The developers develop because they want to use the
   code. And when somebody wants to use a feature that doesn't exist,
   then
   they can develop it.
  
   In short, if you want this feature, you can do this yourself and
   post a
   patch to this list so it gets included in the next release. There
   is no
   paid dev team for any of the commons code.
  
   OK i see. My intention was just to bring up an issue/request. There is
   no saying the developers are obligated to work on it. And contributing
   code is certainly not the only way of involving with an open source
   project.
 
  That is of course true. But I guess what Simon was wanting to point
  out is that requesting a feature rarely means it gets done unless a
  developer needs that feature too. The personal TODO lists are usually
  very long. So usually you are better of contributing a patch
  yourself ...or pay someone to do it for you. That's how it works.
 
   Nevertheless, if the dev team (unpaid as I understand it)
   wants the project to be more popular, it might help listening to what
   other users have to say other than themselves; otherwise just forget
   about it.
 
  Well, sure ...listening is great - but someone gotta do it.
  There is no us vs you, no dev vs user there is just us.
 
  Does that make sense?
 
  cheers
  --
  Torsten
 
  -
  To unsubscribe, e-mail: [EMAIL PROTECTED]
  For additional commands, e-mail: [EMAIL PROTECTED]
 
 




Re: codec - thread safe

2007-10-07 Thread Qingtian Wang
On 10/7/07, sebb [EMAIL PROTECTED] wrote:
 On 07/10/2007, Qingtian Wang [EMAIL PROTECTED] wrote:
  Ok, I got the point.
 
  So let's say I wanted to work on this. What's the most effective way
  to do it?

 You could try running some of the code checking utilities on the library.

 For example, FindBugs and PMD.

 I've done a quick check with FindBugs, but that has not found any
 synchronisation issues. This is because there are no synchronised
 methods, so it cannot find any inconsistencies.

 But it should be possible to configure PMD to find any instance or
 static variables that are modified after construction.

 Or indeed you could perhaps try making all such variables final, and
 see whether it's possible to fix the compiler errors - but that might
 be a lot of work.
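
A rough sketch of the "look for non-final variables" idea, assuming plain Java reflection instead of PMD. The helper `MutableFieldAudit` and the two sample classes are hypothetical and are not part of commons-codec, FindBugs, or PMD. The audit only lists fields that are not declared final, which makes them candidates for review; it does not prove that they are written after construction.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class MutableFieldAudit {

    // Returns the names of fields declared directly on 'type' that are not final.
    static List<String> nonFinalFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler- or tool-generated fields
            }
            if (!Modifier.isFinal(field.getModifiers())) {
                names.add(field.getName());
            }
        }
        return names;
    }
}

// Hypothetical codec-like class whose state is fixed at construction.
class FixedCharsetCodec {
    private final String charset;

    FixedCharsetCodec(String charset) {
        this.charset = charset;
    }

    String charset() {
        return charset;
    }
}

// Hypothetical codec-like class whose state can change after construction.
class ReconfigurableCodec {
    private String charset = "UTF-8";

    void setCharset(String charset) {
        this.charset = charset;
    }

    String charset() {
        return charset;
    }
}
```

Calling `MutableFieldAudit.nonFinalFields(ReconfigurableCodec.class)` reports the `charset` field, while `FixedCharsetCodec` reports nothing. A final field is still not a guarantee on its own: a final reference to an array or a buffer can be mutated, so flagged and unflagged classes both deserve a read.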

  Search the entire code base line by line, trying to ID all the thread
  unsafe points by myself? I guess that's very ineffective compared to
  having an issue opened and having the individual developers who wrote
  the code address it. They know what needs to be tweaked without having
  to spend any time, since it's their own code. Or at least, the dev
  team as a whole can come up with a list of points that need to be worked

 Not necessarily - the code may have been developed over a long period,
 and the original authors may no longer be around. Unless the code has
 been worked on recently, it's possible that no-one has much detailed
 knowledge of the code.

Wow, if that is really the case with commons-codec, that no one has
detailed knowledge of the code, it would suggest to me that no one
should use it as an off-the-shelf product, since no one really
knows what it would do.

I mean, it's one thing to say "Here's how we intend this product to
behave, but use it at your own risk since there is no guarantee of any
kind." It's an entirely different thing to say "Here's a bunch of code,
but we don't know how it behaves; you're welcome to find out for
yourself and play around with it."

To me those two things put a product at two rather different levels of
maturity. If codec falls in the latter, a normal user who needs a
codec library for their work might as well look somewhere else or
roll their own code to handle their codec needs, since it would take
the same or even more time to dig into commons-codec and find out what
exactly it is trying to do.

-Q





  on. I think that'd be much more effective. Any established channel
  where that can happen?

 If you find some bugs, then report them via JIRA.
 It helps if you can provide test cases; obviously patches as well are
 even better.

  Thanks,
  -Q
 
 
 
  On 10/7/07, Torsten Curdt [EMAIL PROTECTED] wrote:
Can the dev team make that happen? - a humble request from a user.
   
The thing about open source is that there is no distinction between
developer and user.
   
That's interesting. Why then do almost all open source projects have
users doc vs. developer doc, users mailing list vs. developers
mailing list?
  
   Point taken. But this is more about the presentation of information.
   Still the distinction is blurry as many developers and contributors
   come initially from the user background. They felt the itch to
   improve the code and ended up providing the fix/feature
   themselves ...I guess we are trying to get you to do the same :) You
   should see that more as an invitation to help.
  
The developers develop because they want to use the
code. And when somebody wants to use a feature that doesn't exist,
then they can develop it.
   
In short, if you want this feature, you can do this yourself and
post a patch to this list so it gets included in the next release.
There is no paid dev team for any of the commons code.
   
OK, I see. My intention was just to bring up an issue/request. I am
not saying the developers are obligated to work on it. And contributing
code is certainly not the only way of getting involved with an open
source project.
  
   That is of course true. But I guess what Simon was wanting to point
   out is that requesting a feature rarely means it gets done unless a
   developer needs that feature too. The personal TODO lists are usually
   very long. So usually you are better off contributing a patch
   yourself ...or pay someone to do it for you. That's how it works.
  
Nevertheless, if the dev team (unpaid as I understand it)
wants the project to be more popular, it might help listening to what
other users have to say other than themselves; otherwise just forget
about it.
  
   Well, sure ...listening is great - but someone has to do the work.
   There is no us vs you, no dev vs user; there is just us.
  
   Does that make sense?
  
   cheers
   --
   Torsten
  
  
  
 
  

Re: codec - thread safe

2007-10-06 Thread Qingtian Wang
Well, it's a pick-your-poison kind of deal. Either block on one
instance and take a performance hit, or burn up the memory with lots
of instances.

But in the case of BCodec, I think encode/decode is thread safe.
Unfortunately per Henri, that's not generally true for others.

Well, let me make it clear that I am a total layman on codec. But it
seems to me it's not that difficult to implement all the codec methods
in a thread safe manner, without sync blocks.

Can the dev team make that happen? - a humble request from a user.

Thanks,
-Q
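
For what it's worth, here is a sketch of a possible middle ground between blocking on one shared instance and creating an instance per call: keep one instance per thread with `ThreadLocal`. The classes `ScratchBufferCodec` and `PerThreadCodec` are hypothetical stand-ins written for this example and are not commons-codec classes; the hex output format is likewise only an illustration.

```java
// Hypothetical stand-in for a codec that keeps mutable working state and is
// therefore unsafe to share between threads. Not a commons-codec class.
class ScratchBufferCodec {
    private final StringBuilder scratch = new StringBuilder(); // reused on every call

    // Encodes each character as its hexadecimal code, separated by spaces.
    String encode(String in) {
        scratch.setLength(0);
        for (int i = 0; i < in.length(); i++) {
            scratch.append(Integer.toHexString(in.charAt(i))).append(' ');
        }
        return scratch.toString().trim();
    }
}

// One codec per thread: no lock on the encode path, and instances are
// created lazily, at most once per thread that calls encode.
class PerThreadCodec {
    private static final ThreadLocal<ScratchBufferCodec> CODEC =
            ThreadLocal.withInitial(ScratchBufferCodec::new);

    static String encode(String in) {
        return CODEC.get().encode(in);
    }

    // Exposed only so that instance reuse within a thread can be observed.
    static ScratchBufferCodec current() {
        return CODEC.get();
    }
}
```

The trade-off is that memory use grows with the number of threads rather than the number of calls, and in a pooled-thread environment the per-thread instances live as long as the threads do.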




On 10/6/07, ben short [EMAIL PROTECTED] wrote:
 How about...

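 import org.apache.commons.codec.EncoderException;
 import org.apache.commons.codec.net.BCodec;

 class MyStringCodec {
     private final BCodec delegate = new BCodec();

     String encode(String in) throws EncoderException {
         // Only one thread at a time may use the shared delegate.
         synchronized (delegate) {
             return delegate.encode(in);
         }
     }
 }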


 On 10/6/07, Qingtian Wang [EMAIL PROTECTED] wrote:
  Henri,
 
  That was an unpleasant surprise.
 
  So what would be the general suggested program pattern to use the API
  if one wants to be thread safe?
 
  Is it 1:
 
  class MyStringCodec {
      String encode(String in) throws EncoderException {
          // A fresh BCodec for every call
          return new BCodec().encode(in);
      }
  }
 
  or 2:
 
  class MyStringCodec {
      private final BCodec delegate = new BCodec();
 
      String encode(String in) throws EncoderException {
          // One BCodec shared by all callers
          return delegate.encode(in);
      }
  }
 
  Option 2 apparently won't work in a multi-threaded scenario if BCodec
  is thread unsafe. But option 1 really creates A LOT of BCodec objects
  in memory.
 
  -Q
 
 
 
 
  On 10/5/07, Henri Yandell [EMAIL PROTECTED] wrote:
   On 9/29/07, Qingtian Wang [EMAIL PROTECTED] wrote:
I apologize for this question as it must have been asked a million
times: I was unable to search the mailing list archive for this.
   
Are all the encode/decode methods in commons-codec intended to be
thread safe?
   
I peeked into the source code for a couple of those methods. They both
seem to be thread safe. But I'd rather ask here to make sure that was
the general intention when the code was written.
  
   I think you'll find it varies. The various algorithms come from
   different sources, and though a common interface was put on them all,
   I doubt that time was spent to make sure they were all threadsafe.
  
   Looking at the objects, I see a few with private attributes that are
   set at more than just construction time - so it is very likely that
   one or another is thread unsafe.
  
   Hen
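
To illustrate the pattern described above, here is a minimal sketch contrasting a class that writes a private attribute on every call with one that only sets its attribute at construction. Both classes are hypothetical and written for this example; neither is taken from commons-codec, and string reversal merely stands in for real encoding work.

```java
// Risky to share between threads: 'lastInput' is written on every call, so a
// second thread can overwrite it between the write and the read below.
class StatefulReverser {
    private String lastInput;

    String encode(String in) {
        lastInput = in;
        return new StringBuilder(lastInput).reverse().toString();
    }
}

// Safe to share: the only field is final and set once in the constructor;
// all per-call working state lives in local variables.
class StatelessReverser {
    private final String prefix;

    StatelessReverser(String prefix) {
        this.prefix = prefix;
    }

    String encode(String in) {
        StringBuilder local = new StringBuilder(in); // visible to this call only
        return prefix + local.reverse();
    }
}
```

Used from a single thread the two behave alike; the difference only shows up when one instance is shared, which is why this kind of problem tends to go unnoticed in ordinary unit tests.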
  
  
  
 