Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Mike Drob
What is the cost of maintaining the fork? I don’t feel it would be fair to
you, Dawid, if we expected you to port over any changes made to hppc
upstream.

Mike

On Sun, May 26, 2024 at 3:59 PM Dawid Weiss wrote:

>> If we increase the hppc fork to 23 classes and 14 test classes, then we
>> can remove the hppc dependency from all modules.
>> Do we agree that we should
>> - Increase the fork size
>> - Move it to oal.internal
>> - Remove the hppc dependency from everywhere
>>
>
> Yes, I think it's the safest way to go and it's also the cleanest - keeps
> the implementation details private and doesn't clash with anything out
> there. Dropping an existing dependency shouldn't be a problem, I think.
>
>
>> Dawid, for the size of hppc, I counted the number of files with
>> find . -type f | wc -l
>> in hppc/build/generated/main
>>
>
> Oh, ok. Many of these are a bit esoteric (even though we don't generate
> all combinations). Taking what's needed sounds reasonable to me - and it
> shouldn't be that much, really.
>
> D.
>
>
>>
>> On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:
>>
>>>
>>> Hi Bruno,
>>>
>>>> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
>>>> classes.
>>>> Forking everything in hppc would mean 525 classes and 193 test classes.
>>>> I'm not sure we want to fork all hppc?
>>>>
>>>
>>> My superficial analysis hinted at far fewer classes but I'll take a look
>>> tomorrow, had a busy day today.
>>>
>>>
>>>> +1 to moving the hppc fork to oal.internal.
>>>>
>>>
>>> Yes, I think it's a good idea to move it and hide it, at least for the
>>> module system.
>>>
>>> D.
>>>
>>>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Dawid Weiss
>
> If we increase the hppc fork to 23 classes and 14 test classes, then we
> can remove the hppc dependency from all modules.
> Do we agree that we should
> - Increase the fork size
> - Move it to oal.internal
> - Remove the hppc dependency from everywhere
>

Yes, I think it's the safest way to go and it's also the cleanest - keeps
the implementation details private and doesn't clash with anything out
there. Dropping an existing dependency shouldn't be a problem, I think.


> Dawid, for the size of hppc, I counted the number of files with
> find . -type f | wc -l
> in hppc/build/generated/main
>

Oh, ok. Many of these are a bit esoteric (even though we don't generate all
combinations). Taking what's needed sounds reasonable to me - and it
shouldn't be that much, really.

D.


>
> On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:
>
>>
>> Hi Bruno,
>>
>>> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
>>> classes.
>>> Forking everything in hppc would mean 525 classes and 193 test classes.
>>> I'm not sure we want to fork all hppc?
>>>
>>
>> My superficial analysis hinted at far fewer classes but I'll take a look
>> tomorrow, had a busy day today.
>>
>>
>>> +1 to moving the hppc fork to oal.internal.
>>>
>>
>> Yes, I think it's a good idea to move it and hide it, at least for the
>> module system.
>>
>> D.
>>
>>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Bruno Roustant
If we increase the hppc fork to 23 classes and 14 test classes, then we can
remove the hppc dependency from all modules.
Do we agree that we should
- Increase the fork size
- Move it to oal.internal
- Remove the hppc dependency from everywhere

I can send a PR for this soon.

Dawid, for the size of hppc, I counted the number of files with
find . -type f | wc -l
in hppc/build/generated/main
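
For anyone who wants to reproduce the count without a shell, here is a
minimal Java sketch of the same idea as the find/wc pipeline above. The
class and method names are made up for illustration; only the directory
path comes from this message.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene or HPPC.
public class CountGeneratedFiles {

  /** Counts regular files below root, recursively, like "find . -type f | wc -l". */
  static long countRegularFiles(Path root) {
    try (Stream<Path> paths = Files.walk(root)) {
      // NOFOLLOW_LINKS so that symlinks are not counted, as with find -type f.
      return paths.filter(p -> Files.isRegularFile(p, LinkOption.NOFOLLOW_LINKS)).count();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Default path is the one quoted above; pass another directory as the first argument.
    Path root = Path.of(args.length > 0 ? args[0] : "hppc/build/generated/main");
    if (!Files.isDirectory(root)) {
      System.err.println("Not a directory: " + root);
      return;
    }
    System.out.println(countRegularFiles(root));
  }
}
```

Note that this counts every generated file, so it gives an upper bound on
the fork size rather than the number of classes Lucene actually needs.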

On Sun, May 26, 2024 at 21:52, Dawid Weiss wrote:

>
> Hi Bruno,
>
>> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
>> classes.
>> Forking everything in hppc would mean 525 classes and 193 test classes.
>> I'm not sure we want to fork all hppc?
>>
>
> My superficial analysis hinted at far fewer classes but I'll take a look
> tomorrow, had a busy day today.
>
>
>> +1 to moving the hppc fork to oal.internal.
>>
>
> Yes, I think it's a good idea to move it and hide it, at least for the
> module system.
>
> D.
>
>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Dawid Weiss
Hi Bruno,

> Currently the hppc fork in Lucene is composed of 15 classes and 8 test
> classes.
> Forking everything in hppc would mean 525 classes and 193 test classes.
> I'm not sure we want to fork all hppc?
>

My superficial analysis hinted at far fewer classes but I'll take a look
tomorrow, had a busy day today.


> +1 to moving the hppc fork to oal.internal.
>

Yes, I think it's a good idea to move it and hide it, at least for the
module system.

D.


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Chris Hegarty
Hi Dawid,

> On 25 May 2024, at 21:08, Dawid Weiss wrote:
> 
> ...
> 
> I understand it's a pain if the order changes from run to run but I don't see 
> a way this can be avoided ([1] is the issue you mentioned on gh). Tests (and 
> code) shouldn't rely on map/set ordering, although I realize it may be 
> difficult to weed out in such a large codebase.

To be clear, I agree, the bug is in the Elasticsearch code - it should not 
depend upon iteration order of these collection types. And yes, it’s difficult 
to weed out and fix, which we’ll continue to work on.
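
As an illustration of what "not depending on iteration order" means in
practice, here is a minimal sketch. It uses java.util collections as a
stand-in for the HPPC containers, and the class and method names are
hypothetical. The point is only that any result derived from a hash
container gets an explicit order imposed before it is compared or emitted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical example; java.util.HashSet stands in for an HPPC hash set.
public class OrderIndependentKeys {

  /** Returns the keys in ascending order, regardless of how the set iterates. */
  static List<Integer> sortedKeys(Set<Integer> keys) {
    List<Integer> copy = new ArrayList<>(keys);
    Collections.sort(copy);
    return copy;
  }

  public static void main(String[] args) {
    Set<Integer> keys = new HashSet<>(List.of(42, 7, 19, 3));
    // Iterating 'keys' directly would bake the hash order into the result.
    System.out.println(sortedKeys(keys)); // prints [3, 7, 19, 42]
  }
}
```

Tests written this way keep passing even when the container returns its
keys in a different order on each invocation.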

> For what it's worth, the next version of HPPC will be a proper module (with 
> com.carrotsearch.hppc id). Would it change anything/ make it easier if I 
> renamed it to just 'hppc'?

Moving to an explicit module with a module-info sounds good. The name, 
com.carrotsearch.hppc, is a fine name for this. No need to revert to the 
automatic module name.

-Chris.
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Bruno Roustant
Currently the hppc fork in Lucene is composed of 15 classes and 8 test
classes.
Forking everything in hppc would mean 525 classes and 193 test classes. I'm
not sure we want to fork all hppc?

+1 to moving the hppc fork to oal.internal.

On Sun, May 26, 2024 at 12:22, Uwe Schindler wrote:

> Hi,
>
> I was also wondering why parts of hppc were forked/copied to Lucene Core,
> others not. IMHO it should be consistent.
>
> I also agree that we should remove the classes completely from the util
> package (public part of API) and move them to the non-exported packages
> under oal.internal. Of course this does not prevent classpath users from
> using those classes (P.S.: for the SharedSecrets and Vectorization there's
> stack inspection to prevent invalid callers from using them, but that's not
> needed for the packages here, as they pose no risk to code even when kept
> public).
>
> +1 to move the classes and fork everything of HPPC to oal.internal package
> and only export it to specific modules in the module-info by a specific
> export (like for test.framework).
>
> Uwe
> On 26.05.2024 at 10:31, Dawid Weiss wrote:
>
>
> I will not have the time for this today but took a quick look and I think
> these external dependencies on hppc can be removed after the work Bruno has
> done to port some of these utility classes to the core. I'd also move the
> entire Lucene hppc fork under internal and only expose it to other Lucene
> modules that need it - would have to verify that no class is part of the
> public API but I don't think it is (in spatial3d and spatial-extras).
>
> Dawid
>
> On Sat, May 25, 2024 at 10:08 PM Dawid Weiss wrote:
>
>>
>> Hi Chris,
>>
>>> Since Elasticsearch is deployed as a module, then we need to update to
>>> hppc 0.9.1 [2], but unfortunately this is not straightforward. In fact,
>>> Ryan has a PR open [3] for the past 2 years without completion! The
>>> iteration order of some collection types in hppc 0.9.x [*] is tickling some
>>> inadvertent order dependencies in Elasticsearch. It may take some time to
>>> track these down and fix them.
>>>
>>
>> I understand it's a pain if the order changes from run to run but I don't
>> see a way this can be avoided ([1] is the issue you mentioned on gh). Tests
>> (and code) shouldn't rely on map/set ordering, although I realize it may be
>> difficult to weed out in such a large codebase.
>>
>> For what it's worth, the next version of HPPC will be a proper module
>> (with com.carrotsearch.hppc id). Would it change anything/ make it easier
>> if I renamed it to just 'hppc'?
>>
>>> I wonder if others may run into either or both of these issues, as we
>>> have in Elasticsearch, if we release 9.11 with this change?
>>>
>>
>> That's why I wasn't entirely sold on having HPPC as the dependency from
>> Lucene when Bruno mentioned it recently - the jar/module hell will surface
>> sooner than later... Maybe it'd be a better idea to just copy what's needed
>> to the core jar and expose those packages to other Lucene modules (so that
>> there is no explicit dependency on HPPC at all)? Bruno copied a lot of
>> those classes already anyway - don't know how much of it is left to copy to
>> drop the dependency.
>>
>> Dawid
>>
>> [1] https://github.com/carrotsearch/hppc/issues/228
>> [2]
>> https://github.com/carrotsearch/hppc/commit/d569a8944091844c62349646f8eeaf35ebfb5ba6
>>
>>
>>>
>>> -Chris.
>>>
>>> [1] https://github.com/apache/lucene/pull/13392
>>> [2] https://github.com/elastic/elasticsearch/pull/109006
>>> [3] https://github.com/elastic/elasticsearch/pull/84168
>>>
>>> [*] HPPC-186: A different strategy has been implemented for collision
>>> avalanche avoidance. This results in removal of Scatter* maps and sets and
>>> their unification with their Hash* counterparts. This change should not
>>> affect any existing code unless it relied on static, specific ordering of
>>> keys. A side effect of this change is that key/value enumerators will
>>> return a different ordering of their container's values on each invocation.
>>> If your code relies on the order of values in associative arrays, it must
>>> order them after they are retrieved. (Bruno Roustant).
>>>
>>> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Uwe Schindler

Hi,

I was also wondering why parts of hppc were forked/copied to Lucene 
Core, others not. IMHO it should be consistent.


I also agree that we should remove the classes completely from the util 
package (public part of API) and move them to the non-exported packages 
under oal.internal. Of course this does not prevent classpath users from 
using those classes (P.S.: for the SharedSecrets and Vectorization 
there's stack inspection to prevent invalid callers from using them, but 
that's not needed for the packages here, as they pose no risk to code 
even when kept public).


+1 to move the classes and fork everything of HPPC to oal.internal 
package and only export it to specific modules in the module-info by a 
specific export (like for test.framework).
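
To make the "specific export" concrete, a qualified export in the core
module descriptor could look roughly like the sketch below. The package
name org.apache.lucene.internal.hppc and the list of consuming modules
are assumptions for illustration, not the final layout. This is a
descriptor fragment and only compiles inside a module that actually
contains the named packages.

```java
// module-info.java of the core module (sketch, names are assumptions)
module org.apache.lucene.core {
  // Public API packages stay exported to every module.
  exports org.apache.lucene.util;

  // Qualified export: the forked classes are readable only by the
  // modules listed after "to", and by no third-party module.
  exports org.apache.lucene.internal.hppc
      to org.apache.lucene.facet, org.apache.lucene.join;
}
```

As noted above, this only restricts module-path users. On the classpath
the package remains accessible.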


Uwe

On 26.05.2024 at 10:31, Dawid Weiss wrote:

> I will not have the time for this today but took a quick look and I
> think these external dependencies on hppc can be removed after the
> work Bruno has done to port some of these utility classes to the core.
> I'd also move the entire Lucene hppc fork under internal and only
> expose it to other Lucene modules that need it - would have to verify
> that no class is part of the public API but I don't think it is (in
> spatial3d and spatial-extras).
>
> Dawid
>
> On Sat, May 25, 2024 at 10:08 PM Dawid Weiss wrote:



>> Hi Chris,
>>
>>> Since Elasticsearch is deployed as a module, then we need to
>>> update to hppc 0.9.1 [2], but unfortunately this is not
>>> straightforward. In fact, Ryan has a PR open [3] for the past
>>> 2 years without completion! The iteration order of some
>>> collection types in hppc 0.9.x [*] is tickling some
>>> inadvertent order dependencies in Elasticsearch. It may take
>>> some time to track these down and fix them.
>>
>> I understand it's a pain if the order changes from run to run but
>> I don't see a way this can be avoided ([1] is the issue you
>> mentioned on gh). Tests (and code) shouldn't rely on map/set
>> ordering, although I realize it may be difficult to weed out in
>> such a large codebase.
>>
>> For what it's worth, the next version of HPPC will be a proper
>> module (with com.carrotsearch.hppc id). Would it change anything/
>> make it easier if I renamed it to just 'hppc'?
>>
>>> I wonder if others may run into either or both of these
>>> issues, as we have in Elasticsearch, if we release 9.11 with
>>> this change?
>>
>> That's why I wasn't entirely sold on having HPPC as the dependency
>> from Lucene when Bruno mentioned it recently - the jar/module hell
>> will surface sooner than later... Maybe it'd be a better idea to
>> just copy what's needed to the core jar and expose those packages
>> to other Lucene modules (so that there is no explicit dependency
>> on HPPC at all)? Bruno copied a lot of those classes already
>> anyway - don't know how much of it is left to copy to drop the
>> dependency.
>>
>> Dawid
>>
>> [1] https://github.com/carrotsearch/hppc/issues/228
>> [2]
>> https://github.com/carrotsearch/hppc/commit/d569a8944091844c62349646f8eeaf35ebfb5ba6


>>> -Chris.
>>>
>>> [1] https://github.com/apache/lucene/pull/13392
>>> [2] https://github.com/elastic/elasticsearch/pull/109006
>>> [3] https://github.com/elastic/elasticsearch/pull/84168
>>>
>>> [*] HPPC-186: A different strategy has been implemented for
>>> collision avalanche avoidance. This results in removal of
>>> Scatter* maps and sets and their unification with their Hash*
>>> counterparts. This change should not affect any existing code
>>> unless it relied on static, specific ordering of keys. A side
>>> effect of this change is that key/value enumerators will
>>> return a different ordering of their container's values on
>>> each invocation. If your code relies on the order of values in
>>> associative arrays, it must order them after they are
>>> retrieved. (Bruno Roustant).


--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: u...@thetaphi.de


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Bruno Roustant
I didn't copy all of hppc; the Lucene hppc fork is limited.
I know the facet module uses some hppc classes that are not in the fork,
and it has had the hppc jar dependency for a while. So maybe we can keep
this dependency?
For the new dependencies that I added to the join and spatial modules,
maybe we can remove them. But that probably requires adapting the code in
some way to use only the fork.

Bruno

On Sun, May 26, 2024 at 10:32, Dawid Weiss wrote:

>
> I will not have the time for this today but took a quick look and I think
> these external dependencies on hppc can be removed after the work Bruno has
> done to port some of these utility classes to the core. I'd also move the
> entire Lucene hppc fork under internal and only expose it to other Lucene
> modules that need it - would have to verify that no class is part of the
> public API but I don't think it is (in spatial3d and spatial-extras).
>
> Dawid
>
> On Sat, May 25, 2024 at 10:08 PM Dawid Weiss wrote:
>
>>
>> Hi Chris,
>>
>>> Since Elasticsearch is deployed as a module, then we need to update to
>>> hppc 0.9.1 [2], but unfortunately this is not straightforward. In fact,
>>> Ryan has a PR open [3] for the past 2 years without completion! The
>>> iteration order of some collection types in hppc 0.9.x [*] is tickling some
>>> inadvertent order dependencies in Elasticsearch. It may take some time to
>>> track these down and fix them.
>>>
>>
>> I understand it's a pain if the order changes from run to run but I don't
>> see a way this can be avoided ([1] is the issue you mentioned on gh). Tests
>> (and code) shouldn't rely on map/set ordering, although I realize it may be
>> difficult to weed out in such a large codebase.
>>
>> For what it's worth, the next version of HPPC will be a proper module
>> (with com.carrotsearch.hppc id). Would it change anything/ make it easier
>> if I renamed it to just 'hppc'?
>>
>>> I wonder if others may run into either or both of these issues, as we
>>> have in Elasticsearch, if we release 9.11 with this change?
>>>
>>
>> That's why I wasn't entirely sold on having HPPC as the dependency from
>> Lucene when Bruno mentioned it recently - the jar/module hell will surface
>> sooner than later... Maybe it'd be a better idea to just copy what's needed
>> to the core jar and expose those packages to other Lucene modules (so that
>> there is no explicit dependency on HPPC at all)? Bruno copied a lot of
>> those classes already anyway - don't know how much of it is left to copy to
>> drop the dependency.
>>
>> Dawid
>>
>> [1] https://github.com/carrotsearch/hppc/issues/228
>> [2]
>> https://github.com/carrotsearch/hppc/commit/d569a8944091844c62349646f8eeaf35ebfb5ba6
>>
>>
>>>
>>> -Chris.
>>>
>>> [1] https://github.com/apache/lucene/pull/13392
>>> [2] https://github.com/elastic/elasticsearch/pull/109006
>>> [3] https://github.com/elastic/elasticsearch/pull/84168
>>>
>>> [*] HPPC-186: A different strategy has been implemented for collision
>>> avalanche avoidance. This results in removal of Scatter* maps and sets and
>>> their unification with their Hash* counterparts. This change should not
>>> affect any existing code unless it relied on static, specific ordering of
>>> keys. A side effect of this change is that key/value enumerators will
>>> return a different ordering of their container's values on each invocation.
>>> If your code relies on the order of values in associative arrays, it must
>>> order them after they are retrieved. (Bruno Roustant).
>>>
>>>


Re: Q: 9.x upgrade to hppc 0.9.1

2024-05-26 Thread Dawid Weiss
I will not have the time for this today but took a quick look and I think
these external dependencies on hppc can be removed after the work Bruno has
done to port some of these utility classes to the core. I'd also move the
entire Lucene hppc fork under internal and only expose it to other Lucene
modules that need it - would have to verify that no class is part of the
public API but I don't think it is (in spatial3d and spatial-extras).

Dawid

On Sat, May 25, 2024 at 10:08 PM Dawid Weiss wrote:

>
> Hi Chris,
>
>> Since Elasticsearch is deployed as a module, then we need to update to
>> hppc 0.9.1 [2], but unfortunately this is not straightforward. In fact,
>> Ryan has a PR open [3] for the past 2 years without completion! The
>> iteration order of some collection types in hppc 0.9.x [*] is tickling some
>> inadvertent order dependencies in Elasticsearch. It may take some time to
>> track these down and fix them.
>>
>
> I understand it's a pain if the order changes from run to run but I don't
> see a way this can be avoided ([1] is the issue you mentioned on gh). Tests
> (and code) shouldn't rely on map/set ordering, although I realize it may be
> difficult to weed out in such a large codebase.
>
> For what it's worth, the next version of HPPC will be a proper module
> (with com.carrotsearch.hppc id). Would it change anything/ make it easier
> if I renamed it to just 'hppc'?
>
>> I wonder if others may run into either or both of these issues, as we have
>> in Elasticsearch, if we release 9.11 with this change?
>>
>
> That's why I wasn't entirely sold on having HPPC as the dependency from
> Lucene when Bruno mentioned it recently - the jar/module hell will surface
> sooner than later... Maybe it'd be a better idea to just copy what's needed
> to the core jar and expose those packages to other Lucene modules (so that
> there is no explicit dependency on HPPC at all)? Bruno copied a lot of
> those classes already anyway - don't know how much of it is left to copy to
> drop the dependency.
>
> Dawid
>
> [1] https://github.com/carrotsearch/hppc/issues/228
> [2]
> https://github.com/carrotsearch/hppc/commit/d569a8944091844c62349646f8eeaf35ebfb5ba6
>
>
>>
>> -Chris.
>>
>> [1] https://github.com/apache/lucene/pull/13392
>> [2] https://github.com/elastic/elasticsearch/pull/109006
>> [3] https://github.com/elastic/elasticsearch/pull/84168
>>
>> [*] HPPC-186: A different strategy has been implemented for collision
>> avalanche avoidance. This results in removal of Scatter* maps and sets and
>> their unification with their Hash* counterparts. This change should not
>> affect any existing code unless it relied on static, specific ordering of
>> keys. A side effect of this change is that key/value enumerators will
>> return a different ordering of their container's values on each invocation.
>> If your code relies on the order of values in associative arrays, it must
>> order them after they are retrieved. (Bruno Roustant).
>>
>>