[RESULT] [VOTE] Apache Sling Installer Configuration Factory 1.4.6

2024-08-21 Thread Stefan Egli

Hi,

The vote has passed with the following result:

+1 (binding): Robert Munteanu, Julian Sedding, Radu Cotescu, Jörg Hoh
+1 (non binding): none

I will copy this release to the Sling dist directory and
promote the artifacts to the central Maven repository.

Cheers,
Stefan

On 15.08.24 10:33, Stefan Egli wrote:

Hi,

We solved 3 issues in this release:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310710&version=12353244&styleName=Text 



Staging repository:

https://repository.apache.org/content/repositories/orgapachesling-2881

You can use this UNIX script to download the release and verify the 
signatures:


https://raw.githubusercontent.com/apache/sling-tooling-release/master/check_staged_release.sh 



Usage:
sh check_staged_release.sh 2881 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan



[VOTE] Apache Sling Installer Configuration Factory 1.4.6

2024-08-15 Thread Stefan Egli

Hi,

We solved 3 issues in this release:

https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310710&version=12353244&styleName=Text

Staging repository:

https://repository.apache.org/content/repositories/orgapachesling-2881

You can use this UNIX script to download the release and verify the signatures:

https://raw.githubusercontent.com/apache/sling-tooling-release/master/check_staged_release.sh

Usage:
sh check_staged_release.sh 2881 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan



Re: Sling Jobs & Eventing Web Console

2024-04-03 Thread Stefan Egli

Hi Konrad,

Suspend stops processing of new jobs but finishes already started jobs 
(note that the default implementation suspends for 60min, then resumes 
the queue). This has no influence on enqueuing, so jobs can still be 
enqueued.
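As a rough illustration of the semantics above, here is a toy model in Java. It is only a sketch and not Sling's queue implementation: the class and method names are made up, and the automatic resume after 60 minutes is not modelled. It shows that enqueuing still works while suspended, and that no new job is handed out for processing until the queue is resumed.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of the suspend semantics; hypothetical names, not Sling's implementation. */
final class SuspendableQueueSketch {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean suspended;

    /** Enqueuing is never blocked by suspend. */
    synchronized void enqueue(String job) {
        pending.add(job);
    }

    synchronized void suspend() {
        suspended = true;
    }

    synchronized void resume() {
        suspended = false;
    }

    /** Returns the next job to start, or null while suspended or when nothing is pending. */
    synchronized String pollNextToStart() {
        return suspended ? null : pending.poll();
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        SuspendableQueueSketch queue = new SuspendableQueueSketch();
        queue.suspend();
        queue.enqueue("job-1"); // still accepted while suspended
        System.out.println("pending while suspended: " + queue.pendingCount());
        System.out.println("started while suspended: " + queue.pollNextToStart());
        queue.resume();
        System.out.println("started after resume: " + queue.pollNextToStart());
    }
}
```

Jobs that were already started before the suspend are outside this model; per the description above they simply run to completion.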


I'm not aware of any helper that goes through the stack trace to find 
which code added a job. There is (debug) logging though, so that might 
be of some help.


Cheers,
Stefan

On 02.04.24 18:04, Konrad Windszus wrote:

Hi
I am currently facing an issue with jobs that are constantly being queued (it 
is not yet clear by whom).
To bring the server back to a stable state I want to use the 
web console at /system/console/slingevent, which offers four actions for each 
active job queue:

Reset Stats
Resume/Suspend
Test
Drop All

(https://github.com/apache/sling-org-apache-sling-event/blob/71c6d4b3219adb640fa5628fb31cad84d31eff2b/src/main/java/org/apache/sling/event/impl/jobs/console/WebConsolePlugin.java#L297-L304)

What happens exactly if I suspend a queue? Do jobs still get enqueued on the 
suspended queue? Is there any helper available to figure out which code is 
responsible for adding jobs?
Thanks in advance

Konrad







Re: [VOTE] Release Apache Sling Commons JSON 2.0.24

2024-03-28 Thread Stefan Egli

+1,


Cheers,

Stefan

On 28.03.24 02:23, Daniel Klco wrote:

+1

On Wed, Mar 27, 2024 at 10:11 AM Robert Munteanu wrote:


Hi,

We need two more binding votes for the release to pass.

Thanks,
Robert



[jira] [Created] (SLING-12236) Introduce config option to bypass oak's DataStore deduplication for job properties

2024-01-22 Thread Stefan Egli (Jira)
Stefan Egli created SLING-12236:
---

             Summary: Introduce config option to bypass oak's DataStore deduplication for job properties
                 Key: SLING-12236
                 URL: https://issues.apache.org/jira/browse/SLING-12236
             Project: Sling
          Issue Type: Task
          Components: Event
            Reporter: Stefan Egli


When a Sling job is created, its properties are persisted using 
ResourceHelper.getOrCreateResource. Typically the property values are 
primitive types or short Strings and are thus embedded. Larger property 
values might be stored as binaries by the underlying DataStore. If for 
some reason different jobs contain identical property values (i.e. identical 
binaries), the DataStore deduplicates them. If such identical binaries are 
concurrently read and written by different Sling instances (as could happen if 
the job queue is not ORDERED and identical property binaries are in play in 
the first place), the DataStore could run into concurrency issues when 
reading/writing the same binary. In a Sling job that could manifest e.g. as a 
ClassNotFoundException.

This situation could be avoided by the application ensuring that it does not 
create such duplicate job binaries.

Alternatively, Sling jobs could introduce a job queue configuration 
that artificially makes binaries unique (e.g. by prepending a hidden UUID).
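To make the "prepend a hidden UUID" idea a bit more concrete, here is a minimal Java sketch. It is an assumption about how such a prefix could look, not a proposed implementation: the class name, the method names and the fixed 16-byte prefix layout are all hypothetical, and nothing here touches the actual Sling or Oak APIs.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.UUID;

/** Hypothetical helper: makes otherwise identical payloads differ by a random 16-byte prefix. */
final class UniqueBinarySketch {
    static final int PREFIX_LENGTH = 16; // a UUID is two longs

    /** Returns a new array consisting of a random UUID followed by the payload. */
    static byte[] addPrefix(byte[] payload) {
        UUID id = UUID.randomUUID();
        return ByteBuffer.allocate(PREFIX_LENGTH + payload.length)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .put(payload)
                .array();
    }

    /** Drops the prefix again so that readers see the original payload. */
    static byte[] stripPrefix(byte[] stored) {
        if (stored.length < PREFIX_LENGTH) {
            throw new IllegalArgumentException("stored value is shorter than the prefix");
        }
        return Arrays.copyOfRange(stored, PREFIX_LENGTH, stored.length);
    }
}
```

With such a prefix two jobs carrying the same property value would produce different binaries, so content-based deduplication would no longer map them to the same stored record. The cost is that the duplicates are then stored more than once.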



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (SLING-11900) Provide alternative terminology for inequitable terms

2023-10-16 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11900.
---

> Provide alternative terminology for inequitable terms
> -
>
> Key: SLING-11900
> URL: https://issues.apache.org/jira/browse/SLING-11900
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Carsten Ziegeler
>Assignee: Carsten Ziegeler
>Priority: Major
> Fix For: Event 4.3.14
>
>
> The configuration for the jobs is using white/black list which is considered 
> inequitable terminology. Therefore, some more acceptable equivalents should 
> be provided for these terms. The proposal is to switch to allow/deny list





[jira] [Closed] (SLING-11923) Sling Events does not Build on Java 17

2023-10-16 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11923.
---

> Sling Events does not Build on Java 17
> --
>
> Key: SLING-11923
> URL: https://issues.apache.org/jira/browse/SLING-11923
> Project: Sling
>  Issue Type: Bug
>  Components: Event
>Affects Versions: Event 4.3.12
>Reporter: Dan Klco
>Assignee: Rishabh Daim
>Priority: Major
> Fix For: Event 4.3.14
>
>
> Attempting to build Sling Events with Java 17 fails with:
> {code:java}
> [main] INFO org.apache.jackrabbit.oak.plugins.index.IndexUpdate - Reindexing 
> completed
> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 1.935 
> s <<< FAILURE! - in org.apache.sling.event.impl.jobs.queues.TestTopicHalting
> [ERROR] 
> org.apache.sling.event.impl.jobs.queues.TestTopicHalting.testUnhalting  Time 
> elapsed: 1.506 s  <<< ERROR!
> java.lang.NoClassDefFoundError: java/security/acl/Group
>   at java.base/java.lang.ClassLoader.defineClass1(Native Method)
>   at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1012)
>   at 
> java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
>  
> {code}
> This class is deprecated for removal in Java 11: 
> https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/acl/Group.html





[jira] [Closed] (SLING-11918) GaugeSupport has infinite recursion in registerWithSuffix

2023-10-16 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11918.
---

> GaugeSupport has infinite recursion in registerWithSuffix
> -
>
> Key: SLING-11918
> URL: https://issues.apache.org/jira/browse/SLING-11918
> Project: Sling
>  Issue Type: Bug
>  Components: Event
>Affects Versions: Event 4.3.8
>Reporter: Patrique Legault
>Priority: Critical
> Fix For: Event 4.3.14
>
>
> This exception occurs on a system with an unknown but particular 
> configuration, but nonetheless causes the system to become unusable.
>  
> {code:java}
> (java.lang.StackOverflowError: Delayed StackOverflowError due to  
> ReservedStackAccess annotated method)
>     at 
> java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1239)
>     at 
> java.base/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:959)
>     at 
> java.management/com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:415)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1855)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:955)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:890)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:320)
>     at 
> java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>     at 
> com.codahale.metrics.JmxReporter$JmxListener.registerMBean(JmxReporter.java:510)
>  [io.dropwizard.metrics.core:3.2.4]
>     at 
> com.codahale.metrics.JmxReporter$JmxListener.onGaugeAdded(JmxReporter.java:535)
>  [io.dropwizard.metrics.core:3.2.4]
>     at 
> com.codahale.metrics.MetricRegistry.notifyListenerOfAddedMetric(MetricRegistry.java:454)
>  [io.dropwizard.metrics.core:3.2.4]
>     at 
> com.codahale.metrics.MetricRegistry.onMetricAdded(MetricRegistry.java:448) 
> [io.dropwizard.metrics.core:3.2.4]
>     at com.codahale.metrics.MetricRegistry.register(MetricRegistry.java:89) 
> [io.dropwizard.metrics.core:3.2.4]
>     at 
> org.apache.sling.event.impl.jobs.stats.GaugeSupport.registerWithSuffix(GaugeSupport.java:150)
>  [org.apache.sling.event:4.3.8]
>     at 
> org.apache.sling.event.impl.jobs.stats.GaugeSupport.registerWithSuffix(GaugeSupport.java:154)
>  [org.apache.sling.event:4.3.8] {code}





[RESULT][Vote] Release Apache Sling Event 4.3.14

2023-10-16 Thread Stefan Egli

Hi,

The vote has passed with the following result:

+1 (binding): Carsten Ziegeler, Robert Munteanu, Daniel Klco and Andrei Dulvac
+1 (non binding): none

I will copy this release to the Sling dist directory and
promote the artifacts to the central Maven repository.

Cheers,
Stefan


On 09.10.23 13:27, Stefan Egli wrote:

Hi,

We solved 3 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12353217

** Bug
    * [SLING-11918] - GaugeSupport has infinite recursion in 
registerWithSuffix

    * [SLING-11923] - Sling Events does not Build on Java 17

** Improvement
    * [SLING-11900] - Provide alternative terminology for inequitable 
terms



Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2793/

You can use this UNIX script to download the release and verify the 
signatures:
https://raw.githubusercontent.com/apache/sling-tooling-release/master/check_staged_release.sh 



Usage:
sh check_staged_release.sh 2793 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan

Re: [Vote] Release Apache Sling Event 4.3.14

2023-10-09 Thread Stefan Egli

Thx for the heads-up Daniel!

Could be another case of SLING-12078. I added a temporary 2-second startup delay there as well; 
let's see if that fixes things (to be removed once we fix SLING-12078).


Cheers,
Stefan

On 09.10.23 16:32, Daniel Klco wrote:

Thanks for checking Robert!

+1

On Mon, Oct 9, 2023 at 10:28 AM Robert Munteanu wrote:


On Mon, 2023-10-09 at 10:16 -0400, Daniel Klco wrote:

Is there a concern that an IT on the Windows Java 17 build timed out?
https://ci-builds.apache.org/blue/organizations/jenkins/Sling%2Fmodules%2Fsling-org-apache-sling-event/detail/master/230/pipeline/61



I see that the latest build passed

https://ci-builds.apache.org/blue/organizations/jenkins/Sling%2Fmodules%2Fsling-org-apache-sling-event/detail/master/232/pipeline

It also passed for me locally (Java 17, Linux) so I would classify this
as a flaky test.

Robert


[Vote] Release Apache Sling Event 4.3.14

2023-10-09 Thread Stefan Egli

Hi,

We solved 3 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12353217

** Bug
* [SLING-11918] - GaugeSupport has infinite recursion in registerWithSuffix
* [SLING-11923] - Sling Events does not Build on Java 17

** Improvement
* [SLING-11900] - Provide alternative terminology for inequitable terms


Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2793/

You can use this UNIX script to download the release and verify the signatures:
https://raw.githubusercontent.com/apache/sling-tooling-release/master/check_staged_release.sh

Usage:
sh check_staged_release.sh 2793 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[jira] [Updated] (SLING-11422) Stop embedding the event.api package in the event bundle

2023-10-09 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11422:

Fix Version/s: Event 4.3.16
   (was: Event 4.3.14)

> Stop embedding the event.api package in the event bundle
> 
>
> Key: SLING-11422
> URL: https://issues.apache.org/jira/browse/SLING-11422
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: Event 4.3.16
>
>
> As discussed in SLING-9664, deploying the Sling Event and Event API bundles 
> separately would be more in line with how we deploy bundles and also fix the 
> Javadoc generation.
> We should make this a minor version bump for the event bundle, to make it 
> clear that deployers need to adapt. Probably the baselining mechanism will 
> complain, but it's something we can ignore for the release.





[jira] [Updated] (SLING-9664) org.apache.sling.event.jobs package not present in javadoc for sling10+

2023-10-09 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-9664:
---
Fix Version/s: Event 4.3.16
   (was: Event 4.3.14)

> org.apache.sling.event.jobs package not present in javadoc for sling10+
> ---
>
> Key: SLING-9664
> URL: https://issues.apache.org/jira/browse/SLING-9664
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Joerg Hoh
>Priority: Major
> Fix For: Event 4.3.16
>
>
> While the javadoc for sling9 [1] covers the org.apache.sling.event.jobs 
> package(s), they went missing with the sling10 javadoc [2] and subsequent 
> versions.
> [1] https://sling.apache.org/apidocs/sling9/index.html
> [2] https://sling.apache.org/apidocs/sling10/index.html





[jira] [Updated] (SLING-12078) Suspected race condition between TOPOLOGY_INIT and JobManager.addJob

2023-10-05 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-12078:

Description: 
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspect race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary).

Tasks:
* provide a test case to reproduce
* fix the race-condition
* undo 
[this|https://github.com/apache/sling-org-apache-sling-event/commit/d16686705908099b26d0a3233f61c4e209880f93]
 and 
[this|https://github.com/apache/sling-org-apache-sling-event/commit/dea04990b770a92f29c2504aa33d8158d68da58f]
 commit

  was:
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspect race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary).

Tasks:
* provide a test case to reproduce
* fix the race-condition
* undo 
[this|https://github.com/apache/sling-org-apache-sling-event/commit/d16686705908099b26d0a3233f61c4e209880f93|
 and 
[this|https://github.com/apache/sling-org-apache-sling-event/commit/dea04990b770a92f29c2504aa33d8158d68da58f]
 commit


> Suspected race condition between TOPOLOGY_INIT and JobManager.addJob
> 
>
> Key: SLING-12078
> URL: https://issues.apache.org/jira/browse/SLING-12078
> Project: Sling
>  Issue Type: Bug
>  Components: Event
>Affects Versions: Event 4.3.12
>    Reporter: Stefan Egli
>Priority: Major
>
> Two regular cases where a job is stored as part of JobManager.addJob():
>  * when a topology is defined, it directly gets stored to the appropriate 
> assigned/target slingId subtree. This is the most frequent case by far.
>  * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put 
> into the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
> CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
> corresponding assigned subtree.
> There is a suspect race condition (test case to be provided), which happens 
> between the thread doing JobManager.addJob() and the thread handling the 
> TOPOLOGY_INIT:
>  * JobManager.addJob determines the target slingId - which is not yet 
> defined, as TOPOLOGY_INIT is just being handled concurrently
>  * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however 
> does not yet find the above new job in unassigned, as the job

[jira] [Updated] (SLING-12078) Suspected race condition between TOPOLOGY_INIT and JobManager.addJob

2023-10-05 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-12078:

Description: 
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspect race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary).

Tasks:
* provide a test case to reproduce
* fix the race-condition
* undo 
[this|https://github.com/apache/sling-org-apache-sling-event/commit/d16686705908099b26d0a3233f61c4e209880f93|
 and 
[this|https://github.com/apache/sling-org-apache-sling-event/commit/dea04990b770a92f29c2504aa33d8158d68da58f]
 commit

  was:
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspect race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary).

Tasks:
* provide a test case to reproduce
* fix the race-condition
* undo 
https://github.com/apache/sling-org-apache-sling-event/commit/d16686705908099b26d0a3233f61c4e209880f93


> Suspected race condition between TOPOLOGY_INIT and JobManager.addJob
> 
>
> Key: SLING-12078
> URL: https://issues.apache.org/jira/browse/SLING-12078
> Project: Sling
>  Issue Type: Bug
>  Components: Event
>Affects Versions: Event 4.3.12
>    Reporter: Stefan Egli
>Priority: Major
>
> Two regular cases where a job is stored as part of JobManager.addJob():
>  * when a topology is defined, it directly gets stored to the appropriate 
> assigned/target slingId subtree. This is the most frequent case by far.
>  * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put 
> into the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
> CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
> corresponding assigned subtree.
> There is a suspect race condition (test case to be provided), which happens 
> between the thread doing JobManager.addJob() and the thread handling the 
> TOPOLOGY_INIT:
>  * JobManager.addJob determines the target slingId - which is not yet 
> defined, as TOPOLOGY_INIT is just being handled concurrently
>  * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however 
> does not yet find the above new job in unassigned, as the job is just being 
> stored concurrently.
> The result is a job in the unassigned subtree, which waits until the next 
> TopologyE

[jira] [Commented] (SLING-12078) Suspected race condition between TOPOLOGY_INIT and JobManager.addJob

2023-10-05 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772259#comment-17772259
 ] 

Stefan Egli commented on SLING-12078:
-

* additional IT test affected, added [same workaround 
there|https://github.com/apache/sling-org-apache-sling-event/commit/dea04990b770a92f29c2504aa33d8158d68da58f]

> Suspected race condition between TOPOLOGY_INIT and JobManager.addJob
> 
>
> Key: SLING-12078
> URL: https://issues.apache.org/jira/browse/SLING-12078
> Project: Sling
>  Issue Type: Bug
>  Components: Event
>Affects Versions: Event 4.3.12
>Reporter: Stefan Egli
>Priority: Major
>
> Two regular cases where a job is stored as part of JobManager.addJob():
>  * when a topology is defined, it directly gets stored to the appropriate 
> assigned/target slingId subtree. This is the most frequent case by far.
>  * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put 
> into the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
> CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
> corresponding assigned subtree.
> There is a suspect race condition (test case to be provided), which happens 
> between the thread doing JobManager.addJob() and the thread handling the 
> TOPOLOGY_INIT:
>  * JobManager.addJob determines the target slingId - which is not yet 
> defined, as TOPOLOGY_INIT is just being handled concurrently
>  * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however 
> does not yet find the above new job in unassigned, as the job is just being 
> stored concurrently.
> The result is a job in the unassigned subtree, which waits until the next 
> TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - 
> which then finds the unassigned job and re/assigns it accordingly. So the job 
> is never lost, but substantially delayed due to this. (the frequency of 
> TopologyEvents depends on actual cluster/property changes happening in the 
> topology and can thus vary).
> Tasks:
> * provide a test case to reproduce
> * fix the race-condition
> * undo 
> [this|https://github.com/apache/sling-org-apache-sling-event/commit/d16686705908099b26d0a3233f61c4e209880f93]
>  and 
> [this|https://github.com/apache/sling-org-apache-sling-event/commit/dea04990b770a92f29c2504aa33d8158d68da58f]
>  commit
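The interleaving described in the issue can be replayed deterministically in a small single-threaded Java model. This is only a sketch under assumptions: the class, fields and methods below are hypothetical stand-ins for JobManager.addJob() and CheckTopologyTask.fullRun(), and the two "threads" are simulated by calling the steps in the problematic order.

```java
import java.util.ArrayList;
import java.util.List;

/** Single-threaded replay of the suspected interleaving; all names are hypothetical. */
final class TopologyRaceSketch {
    private boolean topologyKnown;
    final List<String> unassigned = new ArrayList<>();
    final List<String> assigned = new ArrayList<>();

    /** First half of addJob: check whether a target instance can be determined. */
    boolean addJobDecide() {
        return topologyKnown;
    }

    /** Second half of addJob: store the job where the earlier decision pointed. */
    void addJobStore(String job, boolean targetKnown) {
        (targetKnown ? assigned : unassigned).add(job);
    }

    /** TOPOLOGY_INIT handling: the topology becomes known, then a full run happens. */
    void onTopologyInit() {
        topologyKnown = true;
        fullRun();
    }

    /** Moves every job currently found in unassigned over to assigned. */
    void fullRun() {
        assigned.addAll(unassigned);
        unassigned.clear();
    }

    public static void main(String[] args) {
        TopologyRaceSketch s = new TopologyRaceSketch();
        boolean targetKnown = s.addJobDecide(); // thread A: topology not known yet
        s.onTopologyInit();                     // thread B: full run finds no unassigned job
        s.addJobStore("job-1", targetKnown);    // thread A: job lands in unassigned
        System.out.println("unassigned after TOPOLOGY_INIT: " + s.unassigned);
        s.fullRun();                            // next TopologyEvent triggers another full run
        System.out.println("assigned after next event: " + s.assigned);
    }
}
```

In this model the job is not lost, but it stays in unassigned until the next full run, which matches the delay described above.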





[jira] [Updated] (SLING-12078) Suspected race condition between TOPOLOGY_INIT and JobManager.addJob

2023-10-05 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-12078:

Description: 
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspect race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary).

Tasks:
* provide a test case to reproduce
* fix the race-condition
* undo 
https://github.com/apache/sling-org-apache-sling-event/commit/d16686705908099b26d0a3233f61c4e209880f93

  was:
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspect race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary).

Tasks:
* provide a test case to reproduce
* fix the race-condition
* undo 


> Suspected race condition between TOPOLOGY_INIT and JobManager.addJob
> 
>
> Key: SLING-12078
> URL: https://issues.apache.org/jira/browse/SLING-12078
> Project: Sling
>  Issue Type: Bug
>  Components: Event
>Affects Versions: Event 4.3.12
>    Reporter: Stefan Egli
>Priority: Major
>

[jira] [Updated] (SLING-12078) Suspected race condition between TOPOLOGY_INIT and JobManager.addJob

2023-10-05 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-12078:

Description: 
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspected race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary).

Tasks:
* provide a test case to reproduce
* fix the race-condition
* undo 

  was:
Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspect race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary)


> Suspected race condition between TOPOLOGY_INIT and JobManager.addJob
> --------------------------------------------------------------------
>
>                 Key: SLING-12078
>                 URL: https://issues.apache.org/jira/browse/SLING-12078
>             Project: Sling
>          Issue Type: Bug
>          Components: Event
>    Affects Versions: Event 4.3.12
>            Reporter: Stefan Egli
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (SLING-12078) Suspected race condition between TOPOLOGY_INIT and JobManager.addJob

2023-10-05 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-12078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772242#comment-17772242
 ] 

Stefan Egli commented on SLING-12078:
-

* added a [workaround 
attempt|https://github.com/apache/sling-org-apache-sling-event/commit/d16686705908099b26d0a3233f61c4e209880f93]
 for 2 in/frequently failing tests - that must be reverted as part of this 
ticket, once the race-condition is confirmed/fixed

> Suspected race condition between TOPOLOGY_INIT and JobManager.addJob
> --------------------------------------------------------------------
>
>                 Key: SLING-12078
>                 URL: https://issues.apache.org/jira/browse/SLING-12078
>             Project: Sling
>          Issue Type: Bug
>          Components: Event
>    Affects Versions: Event 4.3.12
>            Reporter: Stefan Egli
>            Priority: Major
>
> Two regular cases where a job is stored as part of JobManager.addJob():
>  * when a topology is defined, it directly gets stored to the appropriate 
> assigned/target slingId subtree. This is the most frequent case by far.
>  * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put 
> into the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
> CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
> corresponding assigned subtree.
> There is a suspect race condition (test case to be provided), which happens 
> between the thread doing JobManager.addJob() and the thread handling the 
> TOPOLOGY_INIT:
>  * JobManager.addJob determines the target slingId - which is not yet 
> defined, as TOPOLOGY_INIT is just being handled concurrently
>  * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however 
> does not yet find the above new job in unassigned, as the job is just being 
> stored concurrently.
> The result is a job in the unassigned subtree, which waits until the next 
> TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - 
> which then finds the unassigned job and re/assigns it accordingly. So the job 
> is never lost, but substantially delayed due to this. (the frequency of 
> TopologyEvents depends on actual cluster/property changes happening in the 
> topology and can thus vary)





[jira] [Created] (SLING-12078) Suspected race condition between TOPOLOGY_INIT and JobManager.addJob

2023-10-05 Thread Stefan Egli (Jira)
Stefan Egli created SLING-12078:
---

             Summary: Suspected race condition between TOPOLOGY_INIT and JobManager.addJob
                 Key: SLING-12078
                 URL: https://issues.apache.org/jira/browse/SLING-12078
             Project: Sling
          Issue Type: Bug
          Components: Event
    Affects Versions: Event 4.3.12
            Reporter: Stefan Egli


Two regular cases where a job is stored as part of JobManager.addJob():
 * when a topology is defined, it directly gets stored to the appropriate 
assigned/target slingId subtree. This is the most frequent case by far.
 * if no topology is defined (no TOPOLOGY_INIT received) yet, it gets put into 
the unassigned subtree. Later upon receiving TOPOLOGY_INIT 
CheckTopologyTask.fullRun() finds such unassigned jobs and moves them to the 
corresponding assigned subtree.

There is a suspected race condition (test case to be provided), which happens 
between the thread doing JobManager.addJob() and the thread handling the 
TOPOLOGY_INIT:
 * JobManager.addJob determines the target slingId - which is not yet defined, 
as TOPOLOGY_INIT is just being handled concurrently
 * CheckTopologyTask.fullRun(), as part of TOPOLOGY_INIT handling, however does 
not yet find the above new job in unassigned, as the job is just being stored 
concurrently.

The result is a job in the unassigned subtree, which waits until the next 
TopologyEvent happens - which then invokes CheckTopologyTask.fullRun() - which 
then finds the unassigned job and re/assigns it accordingly. So the job is 
never lost, but substantially delayed due to this. (the frequency of 
TopologyEvents depends on actual cluster/property changes happening in the 
topology and can thus vary)
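
The interleaving described above can be sketched as a plain check-then-act race. The following is a minimal, deterministic illustration in Java and not the Sling implementation: the class `RaceSketch`, the flag `topologyReady` and the lists `unassigned` and `assigned` are hypothetical stand-ins for the JobManager state and the repository subtrees, and the steps of the two threads are replayed sequentially in the problematic order.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the suspected interleaving; not Sling code. */
final class RaceSketch {
    boolean topologyReady = false;                       // set once TOPOLOGY_INIT is handled
    final List<String> unassigned = new ArrayList<>();   // stand-in for the unassigned subtree
    final List<String> assigned = new ArrayList<>();     // stand-in for a slingId subtree

    /** Step 1 of addJob: decide the target while the topology may still be undefined. */
    boolean addJobReadsTopology() {
        return topologyReady;
    }

    /** Step 2 of addJob: store the job according to the (possibly stale) decision. */
    void addJobStores(String job, boolean sawTopology) {
        (sawTopology ? assigned : unassigned).add(job);
    }

    /** Stand-in for CheckTopologyTask.fullRun(): move everything out of unassigned. */
    void fullRun() {
        topologyReady = true;
        assigned.addAll(unassigned);
        unassigned.clear();
    }

    /** Replays the problematic order and returns the number of stranded jobs. */
    static int strandedAfterRace() {
        RaceSketch s = new RaceSketch();
        boolean saw = s.addJobReadsTopology(); // addJob thread: topology not yet defined
        s.fullRun();                           // topology thread: finds nothing to move
        s.addJobStores("job-1", saw);          // addJob thread: stores into unassigned
        return s.unassigned.size();
    }
}
```

In this model a later call to `fullRun()` (the next TopologyEvent) picks the job up, which matches the "delayed, not lost" behaviour described in the issue.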





[jira] [Commented] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2023-10-02 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771066#comment-17771066
 ] 

Stefan Egli commented on SLING-11662:
-

[~cziegeler], thx for reactivating this - got lost in the noise indeed. I'll 
have a look at the PR.

> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -------------------------------------------------------------------------
>
>                 Key: SLING-11662
>                 URL: https://issues.apache.org/jira/browse/SLING-11662
>             Project: Sling
>          Issue Type: Bug
>          Components: Commons
>    Affects Versions: Commons Scheduler 2.7.12
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless 
> loop can happen in QuartzSchedulerThread.run(), which manifests as 
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms 
> elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable  
> [0x87654321ff00]
>java.lang.Thread.State: RUNNABLE
> at 
> org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}





[jira] [Resolved] (SLING-11894) jcr-contentloader: Fix paxexam Integration Tests with Java 17

2023-08-17 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-11894.
-
Resolution: Fixed

> jcr-contentloader: Fix paxexam Integration Tests with Java 17
> -------------------------------------------------------------
>
>                 Key: SLING-11894
>                 URL: https://issues.apache.org/jira/browse/SLING-11894
>             Project: Sling
>          Issue Type: Bug
>          Components: JCR
>            Reporter: Stefan Seifert
>            Assignee: Rishabh Daim
>            Priority: Major
>             Fix For: JCR ContentLoader 2.6.2
>
>
> Currently, the integration tests for JCR ContentLoader are failing on both 
> Linux and Windows when running with Java 17.
> All ITs are failing with an error like this:
> {noformat}
> [INFO] Running org.apache.sling.jcr.contentloader.it.OrderedInitialContentIT
> [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
> 64.826 s <<< FAILURE! - in 
> org.apache.sling.jcr.contentloader.it.OrderedInitialContentIT
> [ERROR] 
> org.apache.sling.jcr.contentloader.it.OrderedInitialContentIT.initialContentInstalled
>   Time elapsed: 11.066 s  <<< ERROR!
> org.ops4j.pax.swissbox.tracker.ServiceLookupException: gave up waiting for 
> service org.apache.sling.resource.presence.ResourcePresence
> at 
> org.ops4j.pax.swissbox.tracker.ServiceLookup.getService(ServiceLookup.java:199)
> at 
> org.ops4j.pax.swissbox.tracker.ServiceLookup.getService(ServiceLookup.java:136)
> at 
> org.ops4j.pax.exam.inject.internal.ServiceInjector.injectField(ServiceInjector.java:89)
> at 
> org.ops4j.pax.exam.inject.internal.ServiceInjector.injectDeclaredFields(ServiceInjector.java:69)
> at 
> org.ops4j.pax.exam.inject.internal.ServiceInjector.injectFields(ServiceInjector.java:61)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.createTest(ContainerTestRunner.java:68)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.runChildWithRetry(ContainerTestRunner.java:84)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.runChild(ContainerTestRunner.java:75)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.runChild(ContainerTestRunner.java:43)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.JUnitProbeInvoker.invokeViaJUnit(JUnitProbeInvoker.java:124)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.JUnitProbeInvoker.findAndInvoke(JUnitProbeInvoker.java:97)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.JUnitProbeInvoker.call(JUnitProbeInvoker.java:73)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at 
> org.ops4j.pax.swissbox.framework.RemoteFrameworkImpl.invokeMethodOnService(RemoteFrameworkImpl.java:435)
> at 
> org.ops4j.pax.swissbox.framework.RemoteFrameworkImpl.invokeMethodOnService(RemoteFrameworkImpl.java:408)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at 
> java.rmi/sun.rmi.server.U

[jira] [Assigned] (SLING-11894) jcr-contentloader: Fix paxexam Integration Tests with Java 17

2023-08-17 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli reassigned SLING-11894:
---

Assignee: Rishabh Daim

> jcr-contentloader: Fix paxexam Integration Tests with Java 17
> -------------------------------------------------------------
>
>                 Key: SLING-11894
>                 URL: https://issues.apache.org/jira/browse/SLING-11894
>             Project: Sling
>          Issue Type: Bug
>          Components: JCR
>            Reporter: Stefan Seifert
>            Assignee: Rishabh Daim
>            Priority: Major
>             Fix For: JCR ContentLoader 2.6.2
>
>
> Currently, the integration tests for JCR ContentLoader are failing on both 
> Linux and Windows when running with Java 17.
> All ITs are failing with an error like this:
> {noformat}
> [INFO] Running org.apache.sling.jcr.contentloader.it.OrderedInitialContentIT
> [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
> 64.826 s <<< FAILURE! - in 
> org.apache.sling.jcr.contentloader.it.OrderedInitialContentIT
> [ERROR] 
> org.apache.sling.jcr.contentloader.it.OrderedInitialContentIT.initialContentInstalled
>   Time elapsed: 11.066 s  <<< ERROR!
> org.ops4j.pax.swissbox.tracker.ServiceLookupException: gave up waiting for 
> service org.apache.sling.resource.presence.ResourcePresence
> at 
> org.ops4j.pax.swissbox.tracker.ServiceLookup.getService(ServiceLookup.java:199)
> at 
> org.ops4j.pax.swissbox.tracker.ServiceLookup.getService(ServiceLookup.java:136)
> at 
> org.ops4j.pax.exam.inject.internal.ServiceInjector.injectField(ServiceInjector.java:89)
> at 
> org.ops4j.pax.exam.inject.internal.ServiceInjector.injectDeclaredFields(ServiceInjector.java:69)
> at 
> org.ops4j.pax.exam.inject.internal.ServiceInjector.injectFields(ServiceInjector.java:61)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.createTest(ContainerTestRunner.java:68)
> at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.runChildWithRetry(ContainerTestRunner.java:84)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.runChild(ContainerTestRunner.java:75)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.ContainerTestRunner.runChild(ContainerTestRunner.java:43)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.JUnitProbeInvoker.invokeViaJUnit(JUnitProbeInvoker.java:124)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.JUnitProbeInvoker.findAndInvoke(JUnitProbeInvoker.java:97)
> at 
> org.ops4j.pax.exam.invoker.junit.internal.JUnitProbeInvoker.call(JUnitProbeInvoker.java:73)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
> at 
> org.ops4j.pax.swissbox.framework.RemoteFrameworkImpl.invokeMethodOnService(RemoteFrameworkImpl.java:435)
> at 
> org.ops4j.pax.swissbox.framework.RemoteFrameworkImpl.invokeMethodOnService(RemoteFrameworkImpl.java:408)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
> at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:568)
>

Re: [VOTE] Release Apache Sling Engine 2.15.4

2023-07-31 Thread Stefan Egli

+1,

Cheers,
Stefan

On 29.07.23 18:04, Carsten Ziegeler wrote:

Hi,

We solved 2 issues in this release:
https://issues.apache.org/jira/projects/SLING/versions/12353326

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2775

You can use this UNIX script to download the release and verify the
signatures:
https://raw.githubusercontent.com/apache/sling-tooling-release/master/check_staged_release.sh

Usage:
sh check_staged_release.sh 2775 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Regards
Carsten


Re: [VOTE] Release Apache Sling Resource Merger 1.4.4

2023-07-31 Thread Stefan Egli

+1,

Cheers,
Stefan

On 24.07.23 13:30, Carsten Ziegeler wrote:

Hi,

We solved 1 issue in this release:
https://issues.apache.org/jira/browse/SLING-11978

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2774

You can use this UNIX script to download the release and verify the
signatures:
https://raw.githubusercontent.com/apache/sling-tooling-release/master/check_staged_release.sh

Usage:
sh check_staged_release.sh 2774 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Regards
Carsten


[jira] [Commented] (SLING-11923) Sling Events does not Build on Java 17

2023-07-18 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17744276#comment-17744276
 ] 

Stefan Egli commented on SLING-11923:
-

FYI: merged https://github.com/apache/sling-org-apache-sling-event/pull/32

> Sling Events does not Build on Java 17
> --------------------------------------
>
>                 Key: SLING-11923
>                 URL: https://issues.apache.org/jira/browse/SLING-11923
>             Project: Sling
>          Issue Type: Bug
>          Components: Event
>    Affects Versions: Event 4.3.12
>            Reporter: Dan Klco
>            Assignee: Rishabh Daim
>            Priority: Major
>             Fix For: Event 4.3.14
>
>
> Attempting to build Sling Events with Java 17 fails with:
> {code:java}
> [main] INFO org.apache.jackrabbit.oak.plugins.index.IndexUpdate - Reindexing 
> completed
> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 1.935 
> s <<< FAILURE! - in org.apache.sling.event.impl.jobs.queues.TestTopicHalting
> [ERROR] 
> org.apache.sling.event.impl.jobs.queues.TestTopicHalting.testUnhalting  Time 
> elapsed: 1.506 s  <<< ERROR!
> java.lang.NoClassDefFoundError: java/security/acl/Group
>   at java.base/java.lang.ClassLoader.defineClass1(Native Method)
>   at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1012)
>   at 
> java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
>  
> {code}
> This class is marked as deprecated for removal in Java 11 and, as the error 
> above shows, is no longer present in Java 17: 
> https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/acl/Group.html
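
A quick way to confirm whether a given JVM still ships the class named in the NoClassDefFoundError is to try loading it by name. The following is a minimal sketch; `ClassPresence` and `isPresent` are hypothetical helper names introduced for illustration and are not part of Sling or of the fix for this issue.

```java
/** Reports whether a class can be loaded on the running JVM. */
final class ClassPresence {
    static boolean isPresent(String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class.forName(className, false, ClassPresence.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version: " + System.getProperty("java.version"));
        System.out.println("java.security.acl.Group present: "
                + isPresent("java.security.acl.Group"));
    }
}
```

Running `main` on the JDK used for the build shows whether a dependency that still references `java.security.acl.Group` can be expected to load there.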





[jira] [Assigned] (SLING-11923) Sling Events does not work on Java 17

2023-07-13 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli reassigned SLING-11923:
---

Assignee: Rishabh Daim

> Sling Events does not work on Java 17
> -------------------------------------
>
>                 Key: SLING-11923
>                 URL: https://issues.apache.org/jira/browse/SLING-11923
>             Project: Sling
>          Issue Type: Bug
>          Components: Event
>    Affects Versions: Event 4.3.12
>            Reporter: Dan Klco
>            Assignee: Rishabh Daim
>            Priority: Major
>
> Attempting to build Sling Events with Java 17 fails with:
> {code:java}
> [main] INFO org.apache.jackrabbit.oak.plugins.index.IndexUpdate - Reindexing 
> completed
> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 1.935 
> s <<< FAILURE! - in org.apache.sling.event.impl.jobs.queues.TestTopicHalting
> [ERROR] 
> org.apache.sling.event.impl.jobs.queues.TestTopicHalting.testUnhalting  Time 
> elapsed: 1.506 s  <<< ERROR!
> java.lang.NoClassDefFoundError: java/security/acl/Group
>   at java.base/java.lang.ClassLoader.defineClass1(Native Method)
>   at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1012)
>   at 
> java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681)
>   at 
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639)
>  
> {code}
> This class is marked as deprecated for removal in Java 11 and, as the error 
> above shows, is no longer present in Java 17: 
> https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/security/acl/Group.html





[jira] [Resolved] (SLING-11918) GaugeSupport has infinite recursion in registerWithSuffix

2023-07-11 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-11918.
-
Resolution: Fixed

Merged [PR|https://github.com/apache/sling-org-apache-sling-event/pull/31], thx 
[~patlego] !

> GaugeSupport has infinite recursion in registerWithSuffix
> ---------------------------------------------------------
>
>                 Key: SLING-11918
>                 URL: https://issues.apache.org/jira/browse/SLING-11918
>             Project: Sling
>          Issue Type: Bug
>          Components: Event
>    Affects Versions: Event 4.3.8
>            Reporter: Patrique Legault
>            Priority: Critical
>             Fix For: Event 4.3.14
>
>
> This exception occurs on a system with an unknown but particular 
> configuration, but nonetheless causes the system to become unusable.
>  
> {code:java}
> (java.lang.StackOverflowError: Delayed StackOverflowError due to  
> ReservedStackAccess annotated method)
>     at 
> java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1239)
>     at 
> java.base/java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:959)
>     at 
> java.management/com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:415)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1855)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:955)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:890)
>     at 
> java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:320)
>     at 
> java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
>     at 
> com.codahale.metrics.JmxReporter$JmxListener.registerMBean(JmxReporter.java:510)
>  [io.dropwizard.metrics.core:3.2.4]
>     at 
> com.codahale.metrics.JmxReporter$JmxListener.onGaugeAdded(JmxReporter.java:535)
>  [io.dropwizard.metrics.core:3.2.4]
>     at 
> com.codahale.metrics.MetricRegistry.notifyListenerOfAddedMetric(MetricRegistry.java:454)
>  [io.dropwizard.metrics.core:3.2.4]
>     at 
> com.codahale.metrics.MetricRegistry.onMetricAdded(MetricRegistry.java:448) 
> [io.dropwizard.metrics.core:3.2.4]
>     at com.codahale.metrics.MetricRegistry.register(MetricRegistry.java:89) 
> [io.dropwizard.metrics.core:3.2.4]
>     at 
> org.apache.sling.event.impl.jobs.stats.GaugeSupport.registerWithSuffix(GaugeSupport.java:150)
>  [org.apache.sling.event:4.3.8]
>     at 
> org.apache.sling.event.impl.jobs.stats.GaugeSupport.registerWithSuffix(GaugeSupport.java:154)
>  [org.apache.sling.event:4.3.8] {code}
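
The trace shows registerWithSuffix re-entering itself (GaugeSupport.java lines 150 and 154). What triggers the recursion and how the merged PR resolves it are not shown here. As a general illustration only, a retry-with-suffix scheme can be bounded by a loop with a maximum number of attempts. In the sketch below the class name, the `Map`-based registry and `MAX_ATTEMPTS` are hypothetical and stand in for the real metrics registry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: bounded suffix search instead of unbounded recursion. */
final class BoundedSuffixRegistration {
    static final int MAX_ATTEMPTS = 10; // arbitrary illustrative limit

    /**
     * Tries name, name_1, name_2, ... and returns the name that was registered,
     * or null if every candidate up to MAX_ATTEMPTS is already taken.
     */
    static String registerWithSuffix(Map<String, Object> registry, String name, Object gauge) {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            String candidate = attempt == 0 ? name : name + "_" + attempt;
            // putIfAbsent returns null only when the key was free (gauge is assumed non-null)
            if (registry.putIfAbsent(candidate, gauge) == null) {
                return candidate;
            }
        }
        return null; // give up instead of retrying forever
    }

    public static void main(String[] args) {
        Map<String, Object> registry = new HashMap<>();
        System.out.println(registerWithSuffix(registry, "queue.size", new Object()));
        System.out.println(registerWithSuffix(registry, "queue.size", new Object()));
    }
}
```

The point of the loop is that a failure to register can never grow the call stack; the caller decides what to do when `null` comes back.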





[jira] [Created] (SLING-11901) Extend job metrics

2023-06-05 Thread Stefan Egli (Jira)
Stefan Egli created SLING-11901:
---

             Summary: Extend job metrics
                 Key: SLING-11901
                 URL: https://issues.apache.org/jira/browse/SLING-11901
             Project: Sling
          Issue Type: Task
          Components: Event
    Affects Versions: Event 4.3.12
            Reporter: Stefan Egli


Below is a list of additional metrics to add to sling.event on top of what 
SLING-8665 already added earlier:

* a gauge for number of configured queues
* a gauge for number of queues that have currently queued jobs
* a gauge for number of queues that have currently running jobs
* a per-queue histogram of waiting time of jobs (current "averageWaitingTime" 
is total average only)
* a per-queue histogram of durations of ongoing jobs
* a per-queue histogram of durations of finished jobs (current 
"averageProcessingTime" is total average only)
* a per-queue gauge for number of job retries (as far as available)

The metric below is potentially controversial, as it is implementation-specific:
* a per-queue gauge for number of job reassignments
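
To make the difference between a total average and a per-queue histogram concrete, here is a minimal sketch in plain Java. It is not the sling.event metrics code: `QueueWaitHistogram`, the bucket bounds and the method names are assumptions chosen for illustration, and a real implementation would more likely use the histogram type of the metrics library already in use.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical per-queue histogram of job waiting times; not the sling.event API. */
final class QueueWaitHistogram {
    // Upper bucket bounds in milliseconds; the last bucket catches everything above.
    static final long[] BOUNDS_MS = {10, 100, 1_000, 10_000};

    private final Map<String, AtomicLongArray> perQueue = new ConcurrentHashMap<>();

    /** Records one waiting time for the given queue. */
    void record(String queueName, long waitMs) {
        AtomicLongArray buckets = perQueue.computeIfAbsent(
                queueName, q -> new AtomicLongArray(BOUNDS_MS.length + 1));
        buckets.incrementAndGet(bucketIndex(waitMs));
    }

    /** Returns the count in one bucket, or 0 for a queue that was never seen. */
    long count(String queueName, int bucket) {
        AtomicLongArray buckets = perQueue.get(queueName);
        return buckets == null ? 0 : buckets.get(bucket);
    }

    /** Maps a waiting time to the first bucket whose bound it does not exceed. */
    static int bucketIndex(long waitMs) {
        for (int i = 0; i < BOUNDS_MS.length; i++) {
            if (waitMs <= BOUNDS_MS[i]) {
                return i;
            }
        }
        return BOUNDS_MS.length;
    }
}
```

Keeping one bucket array per queue name is what lets a slow queue stand out, which a single overall average hides.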






[jira] [Closed] (SLING-11797) Log Jobs Added with No Assigned Topology Capability at Info

2023-04-03 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11797.
---

> Log Jobs Added with No Assigned Topology Capability at Info
> -----------------------------------------------------------
>
>                 Key: SLING-11797
>                 URL: https://issues.apache.org/jira/browse/SLING-11797
>             Project: Sling
>          Issue Type: Bug
>          Components: Event
>    Affects Versions: Event 4.3.6
>            Reporter: Dan Klco
>            Assignee: Dan Klco
>            Priority: Minor
>             Fix For: Event 4.3.8
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When creating a job where the topology does not provide a capability for the 
> topic, the JobManagerImpl logs the following message at the DEBUG level:
> {quote}Persisting job {} into queue {}{quote}
>  
> This makes it challenging to identify/diagnose issues with jobs not being 
> assigned as:
>  * It requires enabling debug logging on the JobManagerImpl which 
> can be quite verbose, especially under load
>  * Since most production instances do not run with DEBUG, these situations 
> will not be available in logs
>  * The log message does not indicate that this job will not be immediately 
> assigned to be processed
> Instead, the JobManagerImpl should log a message at least at INFO level which 
> indicates that the Job being persisted does not have an assigned target.
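
The requested behaviour can be sketched as choosing the log level from whether a target exists. This is an illustration only, not the JobManagerImpl change: `UnassignedJobLogging` and its methods are hypothetical, and `java.util.logging` is used to keep the example self-contained, with `Level.FINE` standing in for DEBUG.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: raise the level when a job has no assigned target. */
final class UnassignedJobLogging {
    private static final Logger LOG = Logger.getLogger(UnassignedJobLogging.class.getName());

    /** INFO when there is no target, otherwise FINE (the DEBUG stand-in). */
    static Level levelFor(String targetId) {
        return targetId == null ? Level.INFO : Level.FINE;
    }

    /** Builds a message that states explicitly when the job is left unassigned. */
    static String describe(String jobId, String topic, String targetId) {
        return targetId == null
                ? "Persisting job " + jobId + " (topic " + topic
                        + ") without an assigned target; it will not be processed until reassigned"
                : "Persisting job " + jobId + " (topic " + topic + ") for target " + targetId;
    }

    static void logPersist(String jobId, String topic, String targetId) {
        LOG.log(levelFor(targetId), describe(jobId, topic, targetId));
    }
}
```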





[jira] [Closed] (SLING-11793) Limit log messages via JobExecutionContext.log()

2023-04-03 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11793.
---

> Limit log messages via JobExecutionContext.log()
> ------------------------------------------------
>
>                 Key: SLING-11793
>                 URL: https://issues.apache.org/jira/browse/SLING-11793
>             Project: Sling
>          Issue Type: Improvement
>          Components: Event
>            Reporter: Rishabh Kumar
>            Priority: Major
>             Fix For: Event 4.3.8
>
>          Time Spent: 5h
>  Remaining Estimate: 0h
>
> Currently, every log message passed via JobExecutionContext.log() is appended 
> to previous messages and then stored in the repository. This can bloat the 
> repository and is discouraged as described in JavaDoc:
> {quote}A job consumer can use this method during job processing to add 
> additional information about the current state of job processing. As calling 
> this method adds a significant overhead it should only be used to log a few 
> statements per job processing. If a consumer wants to output detailed 
> information about the processing it should persists it by itself and not use 
> this method for it. The message and the arguments are passed to the 
> MessageFormat class.{quote}
> Some job implementations ignore this advice and still log potentially many 
> messages during execution.
> The Sling Job implementation should ignore further log
> messages when a threshold is reached. This may be configurable to make it
> backward compatible.
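
The threshold idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Sling Event implementation: the class CappedJobLogSketch, its maxMessages parameter and the convention that a value of 0 or less means "unlimited" (to stay backward compatible) are all hypothetical. Only the use of java.text.MessageFormat is taken from the quoted JavaDoc.

```java
import java.text.MessageFormat;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a capped progress log; not the Sling Event implementation. */
final class CappedJobLogSketch {
    private final int maxMessages;                      // assumed threshold; <= 0 means unlimited
    private final List<String> messages = new ArrayList<>();
    private int dropped;

    CappedJobLogSketch(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    /** Formats and stores the message unless the threshold is reached; returns true if stored. */
    boolean log(String pattern, Object... args) {
        if (maxMessages > 0 && messages.size() >= maxMessages) {
            dropped++;                                  // ignore further messages, but count them
            return false;
        }
        messages.add(MessageFormat.format(pattern, args));
        return true;
    }

    List<String> stored() {
        return messages;
    }

    int droppedCount() {
        return dropped;
    }

    public static void main(String[] args) {
        CappedJobLogSketch log = new CappedJobLogSketch(2);
        for (String step : new String[] {"a", "b", "c"}) {
            log.log("finished step {0}", step);
        }
        System.out.println(log.stored() + ", dropped=" + log.droppedCount());
    }
}
```

Counting dropped messages rather than silently discarding them would let an implementation record a single "N further messages ignored" note, though whether that is desirable is a design choice not settled in the issue.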



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (SLING-11805) Don't stop slingId cleanup upon PROPERTIES_CHANGED

2023-04-03 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11805.
---

> Don't stop slingId cleanup upon PROPERTIES_CHANGED
> --
>
> Key: SLING-11805
> URL: https://issues.apache.org/jira/browse/SLING-11805
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.40
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.44
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is a follow-up to SLING-10854, where the SlingIdCleanupTask was introduced.
> The current implementation stops cleanup when it receives a
> PROPERTIES_CHANGED event. This is actually wrong: it should continue. The way
> it is currently done has the effect that cleanup is only triggered upon a
> TOPOLOGY_INIT or TOPOLOGY_CHANGED without a following PROPERTIES_CHANGED.
> This current behaviour reduces the chances of the cleanup running; having
> said that, the likelihood of the cleanup eventually running is still very
> high.
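
The intended event handling can be illustrated with a small state sketch. It is not the SlingIdCleanupTask code: CleanupTriggerSketch and its local EventType enum are hypothetical stand-ins (the enum only mirrors the event names used in the description), and cancelling on TOPOLOGY_CHANGING is an assumption made for the sake of the example.

```java
/** Hypothetical sketch; not the actual SlingIdCleanupTask. */
final class CleanupTriggerSketch {

    /** Local stand-in mirroring the topology event names mentioned above. */
    enum EventType { TOPOLOGY_INIT, TOPOLOGY_CHANGING, TOPOLOGY_CHANGED, PROPERTIES_CHANGED }

    private boolean cleanupScheduled;

    void handle(EventType type) {
        switch (type) {
            case TOPOLOGY_INIT:
            case TOPOLOGY_CHANGED:
                cleanupScheduled = true;    // stable topology: (re)schedule the cleanup
                break;
            case TOPOLOGY_CHANGING:
                cleanupScheduled = false;   // assumption: hold off while the topology is in flux
                break;
            case PROPERTIES_CHANGED:
                // Deliberately a no-op: a property update must not cancel a scheduled cleanup.
                break;
        }
    }

    boolean isCleanupScheduled() {
        return cleanupScheduled;
    }

    public static void main(String[] args) {
        CleanupTriggerSketch task = new CleanupTriggerSketch();
        task.handle(EventType.TOPOLOGY_INIT);
        task.handle(EventType.PROPERTIES_CHANGED);
        System.out.println("cleanup still scheduled: " + task.isCleanupScheduled());
    }
}
```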



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (SLING-10854) Introduce cleanup job of old slingId data in discovery

2023-04-03 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-10854.
---

> Introduce cleanup job of old slingId data in discovery
> --
>
> Key: SLING-10854
> URL: https://issues.apache.org/jira/browse/SLING-10854
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.34
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.44
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Discovery.oak stores nodes and properties per slingId under 
> {{/var/discovery/oak}}. In a scenario where the slingIds are stable things 
> are fine. If the slingIds change frequently, old slingId-related data stays 
> as garbage and accumulates.
> We should introduce a cleanup job to delete old slingId data. The leader 
> could execute this to avoid race conditions. We might need to add some 
> additional property to indicate age of slingIds. There's already the
> {{/var/discovery/oak/clusterInstances/leaderElectionIdCreatedAt}} property
> which gets updated upon each discovery.oak bundle activation, but it's
> somewhat indirect. Having a new, dedicated property sounds cleaner (this one
> could be used to clean up old data though).
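
One way to picture the selection step of such a cleanup job is sketched below. This is purely illustrative and assumes a per-slingId creation timestamp is available in some form; the class SlingIdGarbageSketch, the method findStale and the minimum-age parameter are hypothetical, and no repository access is shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of selecting old slingId data; not the discovery.oak implementation. */
final class SlingIdGarbageSketch {

    /**
     * Returns the slingIds that are neither part of the current topology
     * nor younger than the given minimum age, sorted for stable output.
     */
    static List<String> findStale(Map<String, Long> createdAtBySlingId,
                                  Set<String> activeSlingIds,
                                  long nowMillis,
                                  long minAgeMillis) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : createdAtBySlingId.entrySet()) {
            boolean active = activeSlingIds.contains(entry.getKey());
            boolean oldEnough = nowMillis - entry.getValue() >= minAgeMillis;
            if (!active && oldEnough) {
                stale.add(entry.getKey());
            }
        }
        Collections.sort(stale);
        return stale;
    }

    public static void main(String[] args) {
        Map<String, Long> createdAt = new HashMap<>();
        createdAt.put("old-id", 0L);
        createdAt.put("live-id", 0L);
        Set<String> active = new HashSet<>();
        active.add("live-id");
        System.out.println(findStale(createdAt, active, 10_000L, 5_000L));
    }
}
```

The minimum age guards against deleting data of an instance that has just started but is not yet visible in the topology; only the leader would run this, as suggested above, to avoid concurrent deletes.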



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[RESULT] [VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.44 AND Apache Sling Event 4.3.8

2023-04-03 Thread Stefan Egli

Hi,

The vote has passed with the following result :

+1 (binding): Stefan Seifert, Daniel Klco, Radu Cotescu
+1 (non binding): none

I will copy this release to the Sling dist directory and
promote the artifacts to the central Maven repository.

Cheers,
Stefan
--

On 27.03.23 18:21, Stefan Egli wrote:

Hi,


This vote is about 2 parts:



[part 1]
Apache Sling Oak-Based Discovery Service 1.2.44 :

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352471

There are still some outstanding issues:
https://issues.apache.org/jira/projects/SLING/versions/12353050



[part 2]
Apache Sling Event 4.3.8 :

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12351879

There are still some outstanding issues:
https://issues.apache.org/jira/projects/SLING/versions/12353051



Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2729/

You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD

Usage:
sh check_staged_release.sh 2729 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[Vote] Release Apache Sling Oak-Based Discovery Service 1.2.44 AND Apache Sling Event 4.3.8

2023-03-27 Thread Stefan Egli

Hi,


This vote is about 2 parts:



[part 1]
Apache Sling Oak-Based Discovery Service 1.2.44 :

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352471

There are still some outstanding issues:
https://issues.apache.org/jira/projects/SLING/versions/12353050



[part 2]
Apache Sling Event 4.3.8 :

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12351879

There are still some outstanding issues:
https://issues.apache.org/jira/projects/SLING/versions/12353051



Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2729/

You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD

Usage:
sh check_staged_release.sh 2729 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[jira] [Updated] (SLING-11422) Stop embedding the event.api package in the event bundle

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11422:

Fix Version/s: Event 4.3.10
   (was: Event 4.3.8)

> Stop embedding the event.api package in the event bundle
> 
>
> Key: SLING-11422
> URL: https://issues.apache.org/jira/browse/SLING-11422
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Robert Munteanu
>Priority: Major
> Fix For: Event 4.3.10
>
>
> As discussed in SLING-9664, deploying the Sling Event and Event API bundles 
> separately would be more in line with how we deploy bundles and also fix the 
> Javadoc generation.
> We should make this a minor version bump for the event bundle, to make it 
> clear that deployers need to adapt. Probably the baselining mechanism will 
> complain, but it's something we can ignore for the release.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (SLING-9664) org.apache.sling.event.jobs package not present in javadoc for sling10+

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-9664:
---
Fix Version/s: Event 4.3.10
   (was: Event 4.3.8)

> org.apache.sling.event.jobs package not present in javadoc for sling10+
> ---
>
> Key: SLING-9664
> URL: https://issues.apache.org/jira/browse/SLING-9664
> Project: Sling
>  Issue Type: Improvement
>  Components: Event
>Reporter: Joerg Hoh
>Priority: Major
> Fix For: Event 4.3.10
>
>
> While the javadoc for sling9 [1] covers the org.apache.sling.event.jobs
> package(s), they went missing with the sling10 javadoc [2] and subsequent 
> versions.
> [1] https://sling.apache.org/apidocs/sling9/index.html
> [2] https://sling.apache.org/apidocs/sling10/index.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (SLING-11805) Don't stop slingId cleanup upon PROPERTIES_CHANGED

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-11805.
-
Resolution: Fixed

> Don't stop slingId cleanup upon PROPERTIES_CHANGED
> --
>
> Key: SLING-11805
> URL: https://issues.apache.org/jira/browse/SLING-11805
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.40
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.44
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is a follow-up to SLING-10854, where the SlingIdCleanupTask was introduced.
> The current implementation stops cleanup when it receives a
> PROPERTIES_CHANGED event. This is actually wrong: it should continue. The way
> it is currently done has the effect that cleanup is only triggered upon a
> TOPOLOGY_INIT or TOPOLOGY_CHANGED without a following PROPERTIES_CHANGED.
> This current behaviour reduces the chances of the cleanup running; having
> said that, the likelihood of the cleanup eventually running is still very
> high.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (SLING-9625) DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-9625:
---
Fix Version/s: Discovery Oak 1.2.46
   (was: Discovery Oak 1.2.44)

> DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException 
> -
>
> Key: SLING-9625
> URL: https://issues.apache.org/jira/browse/SLING-9625
> Project: Sling
>  Issue Type: Improvement
>Affects Versions: Discovery Oak 1.2.30
>Reporter: Konrad Windszus
>Priority: Major
> Fix For: Discovery Oak 1.2.46
>
>
> While stopping the OSGi container (Sling Starter 12 SNAPSHOT) I observed the 
> following error
> {code}
> 03.08.2020 10:30:06.262 *INFO * [Apache Sling Terminator] Stopping Apache 
> Sling
> ERROR: bundle org.apache.sling.discovery.oak:1.2.28 
> (139)[org.apache.sling.discovery.oak.OakDiscoveryService(200)] : The 
> updatedPropertyProvider method has thrown an exception
> java.lang.RuntimeException: Could not log in to repository 
> (org.apache.sling.api.resource.LoginException: Cannot derive user name for 
> bundle org.apache.sling.discovery.oak [139] and sub service null)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.doUpdateProperties(OakDiscoveryService.java:540)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.bindPropertyProviderInteral(OakDiscoveryService.java:406)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.updatedPropertyProvider(OakDiscoveryService.java:421)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invokeMethod(BaseMethod.java:242)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.access$500(BaseMethod.java:41)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$Resolved.invoke(BaseMethod.java:678)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$NotResolved.invoke(BaseMethod.java:633)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invoke(BaseMethod.java:524)
>   at 
> org.apache.felix.scr.impl.inject.methods.BindMethod.invoke(BindMethod.java:42)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager.invokeUpdatedMethod(DependencyManager.java:1934)
>   at 
> org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUpdatedMethod(SingleComponentManager.java:448)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:366)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:297)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1229)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1137)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.track(ServiceTracker.java:883)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1168)
>   at 
> org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:125)
>   at 
> org.apache.felix.framework.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:990)
>   at 
> org.apache.felix.framework.EventDispatcher.fireEventImmediately(EventDispatcher.java:838)
>   at 
> org.apache.felix.framework.EventDispatcher.fireServiceEvent(EventDispatcher.java:545)
>   at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4833)
>   at org.apache.felix.framework.Felix.access$000(Felix.java:112)
>   at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:434)
>   at 
> org.apache.felix.framework.ServiceRegistry.servicePropertiesModified(ServiceRegistry.java:601)
>   at 
> org.apache.felix.framework.ServiceRegistrationImpl.setProperties(ServiceRegistrationImpl.java:132)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindService(JobConsumerManager.java:354)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindJobExecutor(JobConsumerManager.java:270)
>   at sun.reflec

[jira] [Updated] (SLING-10813) Improve ViewStateManagerImpl.waitForAsyncEvents, also speeds up tests

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-10813:

Fix Version/s: Discovery Oak 1.2.46
   (was: Discovery Oak 1.2.44)

> Improve ViewStateManagerImpl.waitForAsyncEvents, also speeds up tests
> -
>
> Key: SLING-10813
> URL: https://issues.apache.org/jira/browse/SLING-10813
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>    Reporter: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.46
>
>
> As discussed [in this 
> PR|https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/4#discussion_r708292265]
>  a {{Thread.sleep()}} is currently still required after
> ViewStateManagerImpl.waitForAsyncEvents returns, to ensure anything that was
> "just triggered" has finished executing asynchronously.
> This should be improved in the waitForAsyncEvents method, by being more
> precise about when it returns (i.e. only once any call to
> {{asyncEvent.trigger()}} has terminated).
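
A sketch of what "being more precise about when it returns" could mean: track events from the moment they are enqueued until their trigger call has completed, and let the waiting side block on that count instead of sleeping. The class AsyncEventTrackerSketch and its methods are hypothetical and do not reflect the ViewStateManagerImpl API.

```java
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of an in-flight counter; not the ViewStateManagerImpl code. */
final class AsyncEventTrackerSketch {
    private int inFlight;   // events enqueued or currently executing

    /** Called when an event is handed to the asynchronous sender. */
    synchronized void enqueued() {
        inFlight++;
    }

    /** Called after the event's trigger call has fully terminated. */
    synchronized void finished() {
        inFlight--;
        notifyAll();
    }

    /** Blocks until nothing is in flight; returns false on timeout or interruption. */
    synchronized boolean awaitIdle(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (inFlight > 0) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AsyncEventTrackerSketch tracker = new AsyncEventTrackerSketch();
        tracker.enqueued();
        CompletableFuture.runAsync(tracker::finished);   // simulates the async trigger completing
        System.out.println("idle: " + tracker.awaitIdle(5000));
    }
}
```

Because the count is only decremented after the handler has run, a caller returning from awaitIdle no longer needs an extra sleep to cover "just triggered" events.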



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (SLING-5598) Exclude slow tests by default with assume(sling.slow.tests.enabled)

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-5598:
---
Fix Version/s: Discovery Oak 1.2.46
   (was: Discovery Oak 1.2.44)

> Exclude slow tests by default with assume(sling.slow.tests.enabled) 
> 
>
> Key: SLING-5598
> URL: https://issues.apache.org/jira/browse/SLING-5598
> Project: Sling
>  Issue Type: Task
>  Components: Extensions
>Affects Versions: Discovery Impl 1.2.6, Discovery Base 1.1.2, Discovery 
> Commons 1.0.10, Discovery Oak 1.2.6
>Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Impl 1.2.14, Discovery Base 2.0.16, Discovery 
> Commons 1.0.30, Discovery Oak 1.2.46
>
> Attachments: SLING-5598-commons-testing.patch, 
> SLING-5598-discovery.patch
>
>
> As suggested by [~bdelacretaz] on [the 
> list|http://markmail.org/message/yad5awqg53epk3ck] we should improve test 
> duration (ideally 1-2min per bundle max, 10-15min overall). While they are 
> not yet improved however, slow tests should be excluded by default and run 
> only if enabled explicitly. Here's an example {{@Before}} method to achieve 
> that:
> {noformat}
> @Before
> public void checkSlowTests() {
>     assumeNotNull(System.getProperty("sling.slow.tests.enabled"));
> }
> {noformat}
> and to enable the slow tests you do: {{mvn -Dsling.slow.tests.enabled=true 
> clean test}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (SLING-10008) Add null annotations to package org.apache.sling.discovery (Discovery API)

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-10008:

Fix Version/s: Discovery Oak 1.2.46
   (was: Discovery Oak 1.2.44)

> Add null annotations to package org.apache.sling.discovery (Discovery API)
> --
>
> Key: SLING-10008
> URL: https://issues.apache.org/jira/browse/SLING-10008
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Reporter: Konrad Windszus
>Priority: Major
> Fix For: Discovery Oak 1.2.46
>
>
> In https://github.com/Adobe-Consulting-Services/acs-aem-commons/issues/2492 
> and https://github.com/Adobe-Consulting-Services/acs-aem-commons/issues/2498 
> there were potential NPEs uncovered. To prevent consumers from running into 
> those the Null annotations 
> (https://sling.apache.org/documentation/development/null-analysis.html) 
> should be added to the relevant classes there as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (SLING-11619) Restore safeguard mechanism for discovery config's int and long properties

2023-03-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11619.
---

> Restore safeguard mechanism for discovery config's int and long properties
> --
>
> Key: SLING-11619
> URL: https://issues.apache.org/jira/browse/SLING-11619
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.40
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.42
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> With the [update to parent 
> 47|https://github.com/apache/sling-org-apache-sling-discovery-oak/commit/c306408f36e7636c72b71805d2bb0e3e6f0f0e73#diff-73d443e41e9bfaa5e9c77b6db0e318079f1885f5a7ed9685aae9730209adc579]
>  the discovery.oak's Config "lost" the ability to gracefully deal with wrong 
> values, such as empty strings. It used to silently swallow these, but now 
> fails loudly with
> {noformat}
> org.osgi.service.component.ComponentException: 
> java.lang.NumberFormatException: For input string: ""
>   at 
> org.apache.felix.scr.impl.inject.internal.Annotations$Handler.invoke(Annotations.java:379)
>  [org.apache.felix.scr:2.2.0]
>   at com.sun.proxy.$Proxy368.backoffStandbyFactor(Unknown Source)
>   at org.apache.sling.discovery.oak.Config.configure(Config.java:238) 
> [org.apache.sling.discovery.oak:1.2.40]
>   at org.apache.sling.discovery.oak.Config.activate(Config.java:159) 
> [org.apache.sling.discovery.oak:1.2.40]
> {noformat}
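
A safeguard of the kind described above might look like the following sketch, which falls back to a default instead of letting a NumberFormatException escape. The helper class SafeConfigSketch and its methods are hypothetical; they are not the code in discovery.oak's Config.

```java
/** Hypothetical sketch of lenient numeric config parsing; not the discovery.oak Config code. */
final class SafeConfigSketch {

    /** Returns the value as a long, or the default for null, empty or malformed input. */
    static long toLong(Object value, long defaultValue) {
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        if (value == null) {
            return defaultValue;
        }
        try {
            return Long.parseLong(value.toString().trim());
        } catch (NumberFormatException e) {
            return defaultValue;   // e.g. the empty string from the stack trace above
        }
    }

    /** Same as toLong, but for int properties. */
    static int toInt(Object value, int defaultValue) {
        if (value instanceof Number) {
            return ((Number) value).intValue();
        }
        if (value == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(value.toString().trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        System.out.println(toInt("", 5));       // falls back to the default
        System.out.println(toLong(" 42 ", 5));  // surrounding whitespace is tolerated
    }
}
```

Whether to swallow bad values silently or to log a warning before falling back is left open here; the issue text only states that the old behaviour was silent.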



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (SLING-11805) Don't stop slingId cleanup upon PROPERTIES_CHANGED

2023-03-16 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701206#comment-17701206
 ] 

Stefan Egli commented on SLING-11805:
-

* fix pushed
* PR ready for review : 
https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/14

> Don't stop slingId cleanup upon PROPERTIES_CHANGED
> --
>
> Key: SLING-11805
> URL: https://issues.apache.org/jira/browse/SLING-11805
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.40
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.44
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This is a follow-up to SLING-10854, where the SlingIdCleanupTask was introduced.
> The current implementation stops cleanup when it receives a
> PROPERTIES_CHANGED event. This is actually wrong: it should continue. The way
> it is currently done has the effect that cleanup is only triggered upon a
> TOPOLOGY_INIT or TOPOLOGY_CHANGED without a following PROPERTIES_CHANGED.
> This current behaviour reduces the chances of the cleanup running; having
> said that, the likelihood of the cleanup eventually running is still very
> high.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (SLING-11805) Don't stop slingId cleanup upon PROPERTIES_CHANGED

2023-03-16 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17701093#comment-17701093
 ] 

Stefan Egli commented on SLING-11805:
-

* test added that reproduces this - details in draft PR 
https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/14
* next step is to fix the code 

> Don't stop slingId cleanup upon PROPERTIES_CHANGED
> --
>
> Key: SLING-11805
> URL: https://issues.apache.org/jira/browse/SLING-11805
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.40
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.44
>
>
> This is a follow-up to SLING-10854, where the SlingIdCleanupTask was introduced.
> The current implementation stops cleanup when it receives a
> PROPERTIES_CHANGED event. This is actually wrong: it should continue. The way
> it is currently done has the effect that cleanup is only triggered upon a
> TOPOLOGY_INIT or TOPOLOGY_CHANGED without a following PROPERTIES_CHANGED.
> This current behaviour reduces the chances of the cleanup running; having
> said that, the likelihood of the cleanup eventually running is still very
> high.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (SLING-11805) Don't stop slingId cleanup upon PROPERTIES_CHANGED

2023-03-16 Thread Stefan Egli (Jira)
Stefan Egli created SLING-11805:
---

 Summary: Don't stop slingId cleanup upon PROPERTIES_CHANGED
 Key: SLING-11805
 URL: https://issues.apache.org/jira/browse/SLING-11805
 Project: Sling
  Issue Type: Improvement
  Components: Discovery
Affects Versions: Discovery Oak 1.2.40
Reporter: Stefan Egli
Assignee: Stefan Egli
 Fix For: Discovery Oak 1.2.44


This is a follow-up to SLING-10854, where the SlingIdCleanupTask was introduced.
The current implementation stops cleanup when it receives a PROPERTIES_CHANGED
event. This is actually wrong: it should continue. The way it is currently done
has the effect that cleanup is only triggered upon a TOPOLOGY_INIT or
TOPOLOGY_CHANGED without a following PROPERTIES_CHANGED. This current behaviour
reduces the chances of the cleanup running; having said that, the likelihood
of the cleanup eventually running is still very high.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (SLING-10854) Introduce cleanup job of old slingId data in discovery

2023-03-14 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli reassigned SLING-10854:
---

Assignee: Stefan Egli

> Introduce cleanup job of old slingId data in discovery
> --
>
> Key: SLING-10854
> URL: https://issues.apache.org/jira/browse/SLING-10854
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.34
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.44
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Discovery.oak stores nodes and properties per slingId under 
> {{/var/discovery/oak}}. In a scenario where the slingIds are stable things 
> are fine. If the slingIds change frequently, old slingId-related data stays 
> as garbage and accumulates.
> We should introduce a cleanup job to delete old slingId data. The leader 
> could execute this to avoid race conditions. We might need to add some 
> additional property to indicate age of slingIds. There's already the
> {{/var/discovery/oak/clusterInstances/leaderElectionIdCreatedAt}} property
> which gets updated upon each discovery.oak bundle activation, but it's
> somewhat indirect. Having a new, dedicated property sounds cleaner (this one
> could be used to clean up old data though).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (SLING-10854) Introduce cleanup job of old slingId data in discovery

2023-03-14 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-10854.
-
Resolution: Fixed

PR merged

> Introduce cleanup job of old slingId data in discovery
> --
>
> Key: SLING-10854
> URL: https://issues.apache.org/jira/browse/SLING-10854
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.34
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.44
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> Discovery.oak stores nodes and properties per slingId under 
> {{/var/discovery/oak}}. In a scenario where the slingIds are stable things 
> are fine. If the slingIds change frequently, old slingId-related data stays 
> as garbage and accumulates.
> We should introduce a cleanup job to delete old slingId data. The leader 
> could execute this to avoid race conditions. We might need to add some 
> additional property to indicate age of slingIds. There's already the
> {{/var/discovery/oak/clusterInstances/leaderElectionIdCreatedAt}} property
> which gets updated upon each discovery.oak bundle activation, but it's
> somewhat indirect. Having a new, dedicated property sounds cleaner (this one
> could be used to clean up old data though).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (SLING-10854) Introduce cleanup job of old slingId data in discovery

2023-03-07 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697504#comment-17697504
 ] 

Stefan Egli commented on SLING-10854:
-

* draft PR created at 
https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/13

> Introduce cleanup job of old slingId data in discovery
> --
>
> Key: SLING-10854
> URL: https://issues.apache.org/jira/browse/SLING-10854
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.34
>Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.44
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Discovery.oak stores nodes and properties per slingId under 
> {{/var/discovery/oak}}. In a scenario where the slingIds are stable things 
> are fine. If the slingIds change frequently, old slingId-related data stays 
> as garbage and accumulates.
> We should introduce a cleanup job to delete old slingId data. The leader 
> could execute this to avoid race conditions. We might need to add some 
> additional property to indicate age of slingIds. There's already the
> {{/var/discovery/oak/clusterInstances/leaderElectionIdCreatedAt}} property
> which gets updated upon each discovery.oak bundle activation, but it's
> somewhat indirect. Having a new, dedicated property sounds cleaner (this one
> could be used to clean up old data though).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (SLING-10854) Introduce cleanup job of old slingId data in discovery

2023-03-07 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-10854:

Priority: Major  (was: Minor)

> Introduce cleanup job of old slingId data in discovery
> --
>
> Key: SLING-10854
> URL: https://issues.apache.org/jira/browse/SLING-10854
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.34
>    Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.44
>
>
> Discovery.oak stores nodes and properties per slingId under 
> {{/var/discovery/oak}}. In a scenario where the slingIds are stable things 
> are fine. If the slingIds change frequently, old slingId-related data stays 
> as garbage and accumulates.
> We should introduce a cleanup job to delete old slingId data. The leader 
> could execute this to avoid race conditions. We might need to add some 
> additional property to indicate age of slingIds. There's already the
> {{/var/discovery/oak/clusterInstances/leaderElectionIdCreatedAt}} property
> which gets updated upon each discovery.oak bundle activation, but it's
> somewhat indirect. Having a new, dedicated property sounds cleaner (this one
> could be used to clean up old data though).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Please welcome Julian Reschke

2023-02-09 Thread Stefan Egli

Welcome Julian!

Cheers,
Stefan

On 09.02.23 15:19, Nicolas Peltier wrote:

Welcome Julian!

On Thu, 9 Feb 2023 at 13:59, Jörg Hoh wrote:


Welcome Julian!

On Wed, 8 Feb 2023 at 19:40, Julian Reschke <julian.resc...@gmx.de> wrote:


On 08.02.2023 17:43, ang...@apache.org wrote:

Hi Sling community,

Based on his contributions to the project, the Sling PMC has elected
Julian Reschke as a Sling committer, and he has accepted the

invitation.


Please join me in welcoming Julian.

Julian - if you want to honor the old tradition of new committers
briefly introducing themselves to the list, feel free.

Welcome again and kind regards
Angela


Sure.

Hi Sling friends,

I've been doing development in the Jackrabbit project(s) for quite some
time, and now have been drawn into the Sling space to work on
performance enhancements in the Resource Resolver. Likely there'll be
more related stuff in the future.

When I'm not working on Apache stuff, I occasionally spend time working
in the standards space in the IETF; you might have seen my name on a few
RFCs. (And, a very long time ago, in the JSR space on the javax.jcr
specs). So if there are questions related to JCR or HTTP/servlets, feel
free to ping me. (And yes, I also still like doing XMLly stuff).

Best regards, Julian

PS: disclaimer - almost all of the work I'm doing over here is funded by
Adobe.

PPS: I'm also infamous for actually developing and testing stuff on
Windows :-)





--
Cheers,
Jörg Hoh,

https://cqdump.joerghoh.de
Twitter: @joerghoh





[jira] [Closed] (SLING-10624) Callback when SlingRepository init fails

2023-02-09 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-10624.
---

> Callback when SlingRepository init fails
> 
>
> Key: SLING-10624
> URL: https://issues.apache.org/jira/browse/SLING-10624
> Project: Sling
>  Issue Type: Improvement
>  Components: JCR
>Reporter: Marcel Reutegger
>    Assignee: Stefan Egli
>Priority: Minor
> Fix For: JCR Base 3.1.10
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {{AbstractSlingRepositoryManager}} initializes the repository asynchronously 
> in a separate thread. This makes it difficult for an implementing subclass to 
> detect when initialization fails. An implementing class calls {{start()}}, 
> which returns almost immediately, while the repository is starting up 
> asynchronously. There is no way to detect that {{start()}} was successful.
> There should be a callback method that can be overridden by the implementing 
> class. The method would be called when initialization fails, before 
> {{stop()}} is finally called.
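
The proposal in the description can be illustrated with a minimal sketch. This is not the actual AbstractSlingRepositoryManager code: the class AsyncStarter and the hook name startupFailed are hypothetical and only show the intended call order (failure callback first, then stop()).

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a manager that initializes a repository asynchronously.
abstract class AsyncStarter {
    private final CountDownLatch finished = new CountDownLatch(1);

    /** Returns almost immediately; initialization runs in a separate thread. */
    public final void start() {
        Thread worker = new Thread(() -> {
            try {
                initialize();
            } catch (Throwable cause) {
                try {
                    startupFailed(cause); // notify the subclass first ...
                } finally {
                    stop();               // ... then stop, as the issue proposes
                }
            } finally {
                finished.countDown();
            }
        }, "async-starter");
        worker.setDaemon(true);
        worker.start();
    }

    protected abstract void initialize() throws Exception;

    /** Hypothetical callback: invoked when initialization fails, before stop(). */
    protected void startupFailed(Throwable cause) { }

    protected void stop() { }

    /** Only for the demo: waits until the background initialization has ended. */
    public final void awaitFinished() {
        try {
            finished.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}

class AsyncStarterDemo {
    /** Runs a manager whose initialization fails and returns the observed event order. */
    static String run() {
        StringBuffer events = new StringBuffer();
        AsyncStarter starter = new AsyncStarter() {
            @Override protected void initialize() { throw new IllegalStateException("boom"); }
            @Override protected void startupFailed(Throwable cause) {
                events.append("failed:").append(cause.getMessage());
            }
            @Override protected void stop() { events.append(";stopped"); }
        };
        starter.start();
        starter.awaitFinished();
        return events.toString();
    }
}
```

With such a hook, a subclass no longer has to guess whether start() succeeded; it is told about the failure before stop() runs.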





[jira] [Commented] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2022-11-17 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17635219#comment-17635219
 ] 

Stefan Egli commented on SLING-11662:
-

[~cziegeler], great thx! I'll have a look as soon as possible, was also 
planning to add perhaps another test case or so (but have to first check how 
much coverage there is for these edge cases)..

> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -
>
> Key: SLING-11662
> URL: https://issues.apache.org/jira/browse/SLING-11662
> Project: Sling
>  Issue Type: Bug
>  Components: Commons
>Affects Versions: Commons Scheduler 2.7.12
>Reporter: Stefan Egli
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless 
> loop can happen in QuartzSchedulerThread.run(), which manifests as 
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable [0x87654321ff00]
>    java.lang.Thread.State: RUNNABLE
>     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}





[jira] [Commented] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2022-11-03 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628445#comment-17628445
 ] 

Stefan Egli commented on SLING-11662:
-

(and if there was a reason then we could prevent activation when 
{{maxPoolSize==queueSize}} is configured)

> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -
>
> Key: SLING-11662
> URL: https://issues.apache.org/jira/browse/SLING-11662
> Project: Sling
>  Issue Type: Bug
>  Components: Commons
>Affects Versions: Commons Scheduler 2.7.12
>Reporter: Stefan Egli
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless 
> loop can happen in QuartzSchedulerThread.run(), which manifests as 
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable [0x87654321ff00]
>    java.lang.Thread.State: RUNNABLE
>     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}





[jira] [Comment Edited] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2022-11-03 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628443#comment-17628443
 ] 

Stefan Egli edited comment on SLING-11662 at 11/3/22 5:36 PM:
--

[~cziegeler], I was wondering what the idea of the original (current) 
implementation of blockForAvailableThreads was. Is there a reason it looks at 
(static) configuration rather than actual threads vs queued runnables?


was (Author: egli):
[~cziegeler], what I was wondering what the idea of the original (current) 
implementation of blockForAvailableThreads was.. is there a reason it looks a 
(static) configuration rather than actual threads vs queued runnables?

> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -
>
> Key: SLING-11662
> URL: https://issues.apache.org/jira/browse/SLING-11662
> Project: Sling
>  Issue Type: Bug
>  Components: Commons
>Affects Versions: Commons Scheduler 2.7.12
>Reporter: Stefan Egli
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless 
> loop can happen in QuartzSchedulerThread.run(), which manifests as 
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable [0x87654321ff00]
>    java.lang.Thread.State: RUNNABLE
>     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}





[jira] [Commented] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2022-11-03 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628443#comment-17628443
 ] 

Stefan Egli commented on SLING-11662:
-

[~cziegeler], I was wondering what the idea of the original (current) 
implementation of blockForAvailableThreads was. Is there a reason it looks at 
(static) configuration rather than actual threads vs queued runnables?

> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -
>
> Key: SLING-11662
> URL: https://issues.apache.org/jira/browse/SLING-11662
> Project: Sling
>  Issue Type: Bug
>  Components: Commons
>Affects Versions: Commons Scheduler 2.7.12
>Reporter: Stefan Egli
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless 
> loop can happen in QuartzSchedulerThread.run(), which manifests as 
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable [0x87654321ff00]
>    java.lang.Thread.State: RUNNABLE
>     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}





[jira] [Commented] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2022-11-03 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628427#comment-17628427
 ] 

Stefan Egli commented on SLING-11662:
-

The problem with the endless loop seems to be due to a breach of contract by 
sling.commons.scheduler.QuartzThreadPool:
* [quartz' ThreadPool 
javadoc|https://github.com/quartz-scheduler/quartz/blob/v2.3.2/quartz-core/src/main/java/org/quartz/spi/ThreadPool.java#L69-L82]
 says that
{quote}The implementation of this method should block until there is at least 
one available thread.{quote}
* however 
[sling.commons.scheduler.QuartzThreadPool#blockForAvailableThreads|https://github.com/apache/sling-org-apache-sling-commons-scheduler/blob/a9ddf38ea9d9962c8938a381135827072fc9397f/src/main/java/org/apache/sling/commons/scheduler/impl/QuartzThreadPool.java#L80]
 does not guarantee {{>0}} - and in particular if "maxPoolSize == queueSize" 
then this method will return 0
* that in turn leads quartz to hit the [ironically commented 
line|https://github.com/quartz-scheduler/quartz/blob/v2.3.2/quartz-core/src/main/java/org/quartz/core/QuartzSchedulerThread.java#L411-L414]
{code}
} else { // if(availThreadCount > 0)
    // should never happen, if threadPool.blockForAvailableThreads() follows contract
    continue; // while (!halted)
}
{code}
so it will just .. continue
* now this game repeats for ever after until .. the CPU becomes too hot due to 
constant 100% spinning and it breaks down .. leading to damage in a datacenter 
and so on and so forth

Presumably quartz did nothing wrong here, although it could perhaps add a 
safety/paranoia {{Thread.sleep(1);}} before that {{continue}} to avoid this. 
The problem is rather on the sling side.

Question is how to best fix this.. [~cziegeler], [~joerghoh], any suggestions?
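
One possible direction, sketched under assumptions and not meant as the actual fix: have blockForAvailableThreads look at the runtime number of free worker slots and block until that number is greater than zero, as the quartz contract quoted above requires. The class AvailableThreadGate and its acquire/release methods are hypothetical names; only the method name blockForAvailableThreads comes from the quartz ThreadPool interface.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper, not Sling's QuartzThreadPool: tracks free worker slots at runtime.
class AvailableThreadGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition slotFreed = lock.newCondition();
    private int available;

    AvailableThreadGate(int poolSize) {
        if (poolSize <= 0) {
            throw new IllegalArgumentException("poolSize must be > 0");
        }
        this.available = poolSize;
    }

    /** Blocks until at least one slot is free; the returned count is always > 0. */
    int blockForAvailableThreads() {
        lock.lock();
        try {
            while (available == 0) {
                slotFreed.awaitUninterruptibly();
            }
            return available;
        } finally {
            lock.unlock();
        }
    }

    /** Marks one slot as busy; returns false if no slot is free. */
    boolean acquire() {
        lock.lock();
        try {
            if (available == 0) {
                return false;
            }
            available--;
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Marks one slot as free again and wakes up blocked callers. */
    void release() {
        lock.lock();
        try {
            available++;
            slotFreed.signalAll();
        } finally {
            lock.unlock();
        }
    }
}

class AvailableThreadGateDemo {
    /** Occupies the only slot, frees it from another thread, and returns what the caller sees. */
    static int run() {
        AvailableThreadGate gate = new AvailableThreadGate(1);
        gate.acquire(); // the single slot is now busy
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            gate.release();
        });
        worker.start();
        return gate.blockForAvailableThreads();
    }
}
```

Because the caller waits on a condition instead of getting 0 back, the scheduler thread would sleep rather than spin while the pool is saturated.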

> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -
>
> Key: SLING-11662
> URL: https://issues.apache.org/jira/browse/SLING-11662
> Project: Sling
>  Issue Type: Bug
>  Components: Commons
>Affects Versions: Commons Scheduler 2.7.12
>Reporter: Stefan Egli
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless 
> loop can happen in QuartzSchedulerThread.run(), which manifests as 
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable [0x87654321ff00]
>    java.lang.Thread.State: RUNNABLE
>     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}





[jira] [Commented] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2022-11-03 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628292#comment-17628292
 ] 

Stefan Egli commented on SLING-11662:
-

[PR 
created|https://github.com/apache/sling-org-apache-sling-commons-scheduler/pull/6]
 that reproduces the 100% cpu / endless loop in QuartzSchedulerThread.run()

> Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize
> -
>
> Key: SLING-11662
> URL: https://issues.apache.org/jira/browse/SLING-11662
> Project: Sling
>  Issue Type: Bug
>  Components: Commons
>Affects Versions: Commons Scheduler 2.7.12
>Reporter: Stefan Egli
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When configuring the ThreadPool with maxPoolSize == queueSize, an endless 
> loop can happen in QuartzSchedulerThread.run(), which manifests as 
> follows:
> {noformat}
> "MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable [0x87654321ff00]
>    java.lang.Thread.State: RUNNABLE
>     at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
> {noformat}





[jira] [Created] (SLING-11662) Endless loop in QuartzSchedulerThread.run() with maxPoolSize == queueSize

2022-11-03 Thread Stefan Egli (Jira)
Stefan Egli created SLING-11662:
---

 Summary: Endless loop in QuartzSchedulerThread.run() with 
maxPoolSize == queueSize
 Key: SLING-11662
 URL: https://issues.apache.org/jira/browse/SLING-11662
 Project: Sling
  Issue Type: Bug
  Components: Commons
Affects Versions: Commons Scheduler 2.7.12
Reporter: Stefan Egli


When configuring the ThreadPool with maxPoolSize == queueSize, an endless loop 
can happen in QuartzSchedulerThread.run(), which manifests as follows:

{noformat}
"MyPool_QuartzSchedulerThread" #123 prio=5 os_prio=0 cpu=5123456.78ms elapsed=5163.45s tid=0x12345678ff00 nid=0x1234 runnable [0x87654321ff00]
   java.lang.Thread.State: RUNNABLE
    at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:413)
{noformat}





[RESULT] [VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.42

2022-10-31 Thread Stefan Egli

Hi,

The vote has passed with the following result:

+1 (binding): Jörg Hoh, Carsten Ziegeler, Robert Munteanu, myself

I will copy this release to the Sling dist directory and
promote the artifacts to the central Maven repository.

Cheers,
Stefan


On 27.10.22 17:44, Stefan Egli wrote:

Hi,

We solved 1 issue in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352142

There are still some outstanding issues:
https://issues.apache.org/jira/projects/SLING/versions/12352471

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2688/

You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD 



Usage:
sh check_staged_release.sh 2688 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


Re: [VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.42

2022-10-31 Thread Stefan Egli

+1

Cheers,
Stefan

On 27.10.22 17:44, Stefan Egli wrote:

Hi,

We solved 1 issue in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352142

There are still some outstanding issues:
https://issues.apache.org/jira/projects/SLING/versions/12352471

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2688/

You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD 



Usage:
sh check_staged_release.sh 2688 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.42

2022-10-27 Thread Stefan Egli

Hi,

We solved 1 issue in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352142

There are still some outstanding issues:
https://issues.apache.org/jira/projects/SLING/versions/12352471

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2688/

You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD

Usage:
sh check_staged_release.sh 2688 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[jira] [Updated] (SLING-11619) Restore safeguard mechanism for discovery config's int and long properties

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11619:

Summary: Restore safeguard mechanism for discovery config's int and long 
properties  (was: Restore safeguard mechanism for discovery config)

> Restore safeguard mechanism for discovery config's int and long properties
> --
>
> Key: SLING-11619
> URL: https://issues.apache.org/jira/browse/SLING-11619
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.40
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.42
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> With the [update to parent 
> 47|https://github.com/apache/sling-org-apache-sling-discovery-oak/commit/c306408f36e7636c72b71805d2bb0e3e6f0f0e73#diff-73d443e41e9bfaa5e9c77b6db0e318079f1885f5a7ed9685aae9730209adc579]
>  the discovery.oak's Config "lost" the ability to gracefully deal with wrong 
> values, such as empty strings. It used to silently swallow these, but now 
> fails loudly with
> {noformat}
> org.osgi.service.component.ComponentException: java.lang.NumberFormatException: For input string: ""
>   at org.apache.felix.scr.impl.inject.internal.Annotations$Handler.invoke(Annotations.java:379) [org.apache.felix.scr:2.2.0]
>   at com.sun.proxy.$Proxy368.backoffStandbyFactor(Unknown Source)
>   at org.apache.sling.discovery.oak.Config.configure(Config.java:238) [org.apache.sling.discovery.oak:1.2.40]
>   at org.apache.sling.discovery.oak.Config.activate(Config.java:159) [org.apache.sling.discovery.oak:1.2.40]
> {noformat}
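
A safeguard of the kind described could look roughly like the following sketch. It is an assumption about the general approach, not the merged discovery.oak change: the class LenientNumbers and its methods are hypothetical, and it simply falls back to a default when the raw value is missing, empty or not a number.

```java
// Hypothetical helper; the names are illustrative and not taken from discovery.oak.
final class LenientNumbers {
    private LenientNumbers() { }

    /** Converts a raw config value to an int, falling back to defaultValue on bad input. */
    static int toInt(Object raw, int defaultValue) {
        if (raw == null) {
            return defaultValue;
        }
        if (raw instanceof Number) {
            return ((Number) raw).intValue();
        }
        String text = raw.toString().trim();
        if (text.isEmpty()) {
            return defaultValue; // the empty-string case from the stack trace
        }
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    /** Same as toInt, but for long properties. */
    static long toLong(Object raw, long defaultValue) {
        if (raw == null) {
            return defaultValue;
        }
        if (raw instanceof Number) {
            return ((Number) raw).longValue();
        }
        String text = raw.toString().trim();
        if (text.isEmpty()) {
            return defaultValue;
        }
        try {
            return Long.parseLong(text);
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }
}
```

Reading the properties as raw objects and converting them this way keeps a misconfigured empty string from failing component activation; whether to log a warning in the fallback branches is a separate design choice.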





[jira] [Updated] (SLING-10008) Add null annotations to package org.apache.sling.discovery (Discovery API)

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-10008:

Fix Version/s: Discovery Oak 1.2.44

> Add null annotations to package org.apache.sling.discovery (Discovery API)
> --
>
> Key: SLING-10008
> URL: https://issues.apache.org/jira/browse/SLING-10008
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Reporter: Konrad Windszus
>Priority: Major
> Fix For: Discovery Oak 1.2.44
>
>
> In https://github.com/Adobe-Consulting-Services/acs-aem-commons/issues/2492 
> and https://github.com/Adobe-Consulting-Services/acs-aem-commons/issues/2498 
> there were potential NPEs uncovered. To prevent consumers from running into 
> those the Null annotations 
> (https://sling.apache.org/documentation/development/null-analysis.html) 
> should be added to the relevant classes there as well.





[jira] [Closed] (SLING-10322) Upgrade discovery.* to parent 41

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-10322.
---
Assignee: Stefan Egli

> Upgrade discovery.* to parent 41
> 
>
> Key: SLING-10322
> URL: https://issues.apache.org/jira/browse/SLING-10322
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>    Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.40
>
>
> Discovery.* still use rather old parent versions. They should be upgraded to, 
> e.g., 41. This will involve quite a few changes, including replacing 
> felix.scr.annotations.





[jira] [Resolved] (SLING-10322) Upgrade discovery.* to parent 41

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-10322.
-
Resolution: Fixed

This has meanwhile been done in SLING-11355. Marking resolved.

> Upgrade discovery.* to parent 41
> 
>
> Key: SLING-10322
> URL: https://issues.apache.org/jira/browse/SLING-10322
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.40
>
>
> Discovery.* still use rather old parent versions. They should be upgraded to, 
> e.g., 41. This will involve quite a few changes, including replacing 
> felix.scr.annotations.





[jira] [Updated] (SLING-10322) Upgrade discovery.* to parent 41

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-10322:

Fix Version/s: Discovery Oak 1.2.40

> Upgrade discovery.* to parent 41
> 
>
> Key: SLING-10322
> URL: https://issues.apache.org/jira/browse/SLING-10322
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.40
>
>
> Discovery.* still use rather old parent versions. They should be upgraded to, 
> e.g., 41. This will involve quite a few changes, including replacing 
> felix.scr.annotations.





[jira] [Updated] (SLING-10813) Improve ViewStateManagerImpl.waitForAsyncEvents, also speeds up tests

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-10813:

Fix Version/s: Discovery Oak 1.2.44

> Improve ViewStateManagerImpl.waitForAsyncEvents, also speeds up tests
> -
>
> Key: SLING-10813
> URL: https://issues.apache.org/jira/browse/SLING-10813
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>    Reporter: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.44
>
>
> As discussed [in this 
> PR|https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/4#discussion_r708292265],
> after ViewStateManagerImpl.waitForAsyncEvents returns, a 
> {{Thread.sleep()}} is currently still required to ensure anything that was 
> "just triggered" has finished executing asynchronously.
> This should be improved in the waitForAsyncEvents method by being more 
> precise about when it returns (i.e. only after any call to 
> {{asyncEvent.trigger()}} has terminated).
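
A minimal sketch of what "more precise" could mean, assuming an in-flight counter that is only decremented after the event's code has returned. The class TrackedAsyncSender is hypothetical and does not mirror the real ViewStateManagerImpl; only the method name waitForAsyncEvents is borrowed from the issue.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sender that counts events from enqueue until their execution has ended.
class TrackedAsyncSender {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Object monitor = new Object();
    private int inFlight;

    void enqueue(Runnable event) {
        synchronized (monitor) {
            inFlight++; // counted before the event is handed to the executor
        }
        executor.execute(() -> {
            try {
                event.run();
            } finally {
                synchronized (monitor) {
                    inFlight--; // only after the event has terminated
                    monitor.notifyAll();
                }
            }
        });
    }

    /** Returns true once every enqueued event has finished, false on timeout or interrupt. */
    boolean waitForAsyncEvents(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (monitor) {
            while (inFlight > 0) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                try {
                    monitor.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}

class TrackedAsyncSenderDemo {
    /** Enqueues a slow event and checks that waiting covers its complete execution. */
    static boolean run() {
        TrackedAsyncSender sender = new TrackedAsyncSender();
        AtomicBoolean delivered = new AtomicBoolean();
        try {
            sender.enqueue(() -> {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                delivered.set(true);
            });
            return sender.waitForAsyncEvents(5000) && delivered.get();
        } finally {
            sender.shutdown();
        }
    }
}
```

With this kind of accounting a test can call waitForAsyncEvents and proceed immediately, without an extra sleep to cover events that were just triggered.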





[jira] [Updated] (SLING-10854) Introduce cleanup job of old slingId data in discovery

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-10854:

Fix Version/s: Discovery Oak 1.2.44

> Introduce cleanup job of old slingId data in discovery
> --
>
> Key: SLING-10854
> URL: https://issues.apache.org/jira/browse/SLING-10854
> Project: Sling
>  Issue Type: Improvement
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.34
>    Reporter: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.44
>
>
> Discovery.oak stores nodes and properties per slingId under 
> {{/var/discovery/oak}}. In a scenario where the slingIds are stable things 
> are fine. If the slingIds change frequently, old slingId-related data stays 
> as garbage and accumulates.
> We should introduce a cleanup job to delete old slingId data. The leader 
> could execute this to avoid race conditions. We might need to add some 
> additional property to indicate the age of slingIds. There is already the 
> {{/var/discovery/oak/clusterInstances/leaderElectionIdCreatedAt}} property, 
> which gets updated upon each discovery.oak bundle activation, but it is 
> somewhat indirect. Having a new, dedicated property sounds cleaner (the 
> existing one could be used to clean up old data, though).
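
The selection step of such a cleanup job might be sketched as follows. Everything here is an assumption for illustration: the class SlingIdCleanup, the idea of a per-slingId "last seen" timestamp and the age threshold are hypothetical, and the sketch works on plain maps rather than on repository nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical selection logic for a cleanup job run by the leader.
final class SlingIdCleanup {
    private SlingIdCleanup() { }

    /**
     * Returns the slingIds that are not part of the current view and whose
     * last-seen timestamp is older than maxAgeMillis, in sorted order.
     */
    static List<String> selectStale(Map<String, Long> lastSeenMillis,
                                    Set<String> activeSlingIds,
                                    long nowMillis,
                                    long maxAgeMillis) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeenMillis.entrySet()) {
            boolean active = activeSlingIds.contains(entry.getKey());
            boolean tooOld = nowMillis - entry.getValue() > maxAgeMillis;
            if (!active && tooOld) {
                stale.add(entry.getKey());
            }
        }
        Collections.sort(stale);
        return stale;
    }
}
```

Keeping active slingIds out of the result regardless of their timestamp is a deliberate safety margin, so a long-running instance is never cleaned up just because its timestamp was written long ago.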





[jira] [Updated] (SLING-11619) Restore safeguard mechanism for discovery config

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11619:

Affects Version/s: Discovery Oak 1.2.40

> Restore safeguard mechanism for discovery config
> 
>
> Key: SLING-11619
> URL: https://issues.apache.org/jira/browse/SLING-11619
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.40
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.42
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> With the [update to parent 
> 47|https://github.com/apache/sling-org-apache-sling-discovery-oak/commit/c306408f36e7636c72b71805d2bb0e3e6f0f0e73#diff-73d443e41e9bfaa5e9c77b6db0e318079f1885f5a7ed9685aae9730209adc579]
>  the discovery.oak's Config "lost" the ability to gracefully deal with wrong 
> values, such as empty strings. It used to silently swallow these, but now 
> fails loudly with
> {noformat}
> org.osgi.service.component.ComponentException: java.lang.NumberFormatException: For input string: ""
>   at org.apache.felix.scr.impl.inject.internal.Annotations$Handler.invoke(Annotations.java:379) [org.apache.felix.scr:2.2.0]
>   at com.sun.proxy.$Proxy368.backoffStandbyFactor(Unknown Source)
>   at org.apache.sling.discovery.oak.Config.configure(Config.java:238) [org.apache.sling.discovery.oak:1.2.40]
>   at org.apache.sling.discovery.oak.Config.activate(Config.java:159) [org.apache.sling.discovery.oak:1.2.40]
> {noformat}





[jira] [Updated] (SLING-9625) DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-9625:
---
Fix Version/s: Discovery Oak 1.2.44
   (was: Discovery Oak 1.2.42)

> DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException 
> -
>
> Key: SLING-9625
> URL: https://issues.apache.org/jira/browse/SLING-9625
> Project: Sling
>  Issue Type: Improvement
>Affects Versions: Discovery Oak 1.2.30
>Reporter: Konrad Windszus
>Priority: Major
> Fix For: Discovery Oak 1.2.44
>
>
> While stopping the OSGi container (Sling Starter 12 SNAPSHOT) I observed the 
> following error
> {code}
> 03.08.2020 10:30:06.262 *INFO * [Apache Sling Terminator] Stopping Apache 
> Sling
> ERROR: bundle org.apache.sling.discovery.oak:1.2.28 
> (139)[org.apache.sling.discovery.oak.OakDiscoveryService(200)] : The 
> updatedPropertyProvider method has thrown an exception
> java.lang.RuntimeException: Could not log in to repository 
> (org.apache.sling.api.resource.LoginException: Cannot derive user name for 
> bundle org.apache.sling.discovery.oak [139] and sub service null)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.doUpdateProperties(OakDiscoveryService.java:540)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.bindPropertyProviderInteral(OakDiscoveryService.java:406)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.updatedPropertyProvider(OakDiscoveryService.java:421)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invokeMethod(BaseMethod.java:242)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.access$500(BaseMethod.java:41)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$Resolved.invoke(BaseMethod.java:678)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$NotResolved.invoke(BaseMethod.java:633)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invoke(BaseMethod.java:524)
>   at 
> org.apache.felix.scr.impl.inject.methods.BindMethod.invoke(BindMethod.java:42)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager.invokeUpdatedMethod(DependencyManager.java:1934)
>   at 
> org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUpdatedMethod(SingleComponentManager.java:448)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:366)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:297)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1229)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1137)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.track(ServiceTracker.java:883)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1168)
>   at 
> org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:125)
>   at 
> org.apache.felix.framework.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:990)
>   at 
> org.apache.felix.framework.EventDispatcher.fireEventImmediately(EventDispatcher.java:838)
>   at 
> org.apache.felix.framework.EventDispatcher.fireServiceEvent(EventDispatcher.java:545)
>   at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4833)
>   at org.apache.felix.framework.Felix.access$000(Felix.java:112)
>   at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:434)
>   at 
> org.apache.felix.framework.ServiceRegistry.servicePropertiesModified(ServiceRegistry.java:601)
>   at 
> org.apache.felix.framework.ServiceRegistrationImpl.setProperties(ServiceRegistrationImpl.java:132)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindService(JobConsumerManager.java:354)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindJobExecutor(JobConsumerManager.java:270)
>   at sun.reflec

[jira] [Updated] (SLING-5598) Exclude slow tests by default with assume(sling.slow.tests.enabled)

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-5598:
---
Fix Version/s: Discovery Oak 1.2.44
   (was: Discovery Oak 1.2.42)

> Exclude slow tests by default with assume(sling.slow.tests.enabled) 
> 
>
> Key: SLING-5598
> URL: https://issues.apache.org/jira/browse/SLING-5598
> Project: Sling
>  Issue Type: Task
>  Components: Extensions
>Affects Versions: Discovery Impl 1.2.6, Discovery Base 1.1.2, Discovery 
> Commons 1.0.10, Discovery Oak 1.2.6
>Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Impl 1.2.14, Discovery Base 2.0.16, Discovery 
> Commons 1.0.30, Discovery Oak 1.2.44
>
> Attachments: SLING-5598-commons-testing.patch, 
> SLING-5598-discovery.patch
>
>
> As suggested by [~bdelacretaz] on [the 
> list|http://markmail.org/message/yad5awqg53epk3ck] we should improve test 
> duration (ideally 1-2min per bundle max, 10-15min overall). While they are 
> not yet improved however, slow tests should be excluded by default and run 
> only if enabled explicitly. Here's an example {{@Before}} method to achieve 
> that:
> {noformat}
> @Before
> public void checkSlowTests() {
>     assumeNotNull(System.getProperty("sling.slow.tests.enabled"));
> }
> {noformat}
> and to enable the slow tests you do: {{mvn -Dsling.slow.tests.enabled=true clean test}}





[jira] [Resolved] (SLING-11619) Restore safeguard mechanism for discovery config

2022-10-27 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-11619.
-
Fix Version/s: Discovery Oak 1.2.42
   Resolution: Fixed

PR merged

> Restore safeguard mechanism for discovery config
> 
>
> Key: SLING-11619
> URL: https://issues.apache.org/jira/browse/SLING-11619
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>    Assignee: Stefan Egli
>Priority: Minor
> Fix For: Discovery Oak 1.2.42
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> With the [update to parent 
> 47|https://github.com/apache/sling-org-apache-sling-discovery-oak/commit/c306408f36e7636c72b71805d2bb0e3e6f0f0e73#diff-73d443e41e9bfaa5e9c77b6db0e318079f1885f5a7ed9685aae9730209adc579]
>  the discovery.oak's Config "lost" the ability to gracefully deal with wrong 
> values, such as empty strings. It used to silently swallow these, but now 
> fails loudly with
> {noformat}
> org.osgi.service.component.ComponentException: 
> java.lang.NumberFormatException: For input string: ""
>   at 
> org.apache.felix.scr.impl.inject.internal.Annotations$Handler.invoke(Annotations.java:379)
>  [org.apache.felix.scr:2.2.0]
>   at com.sun.proxy.$Proxy368.backoffStandbyFactor(Unknown Source)
>   at org.apache.sling.discovery.oak.Config.configure(Config.java:238) 
> [org.apache.sling.discovery.oak:1.2.40]
>   at org.apache.sling.discovery.oak.Config.activate(Config.java:159) 
> [org.apache.sling.discovery.oak:1.2.40]
> {noformat}





[jira] [Commented] (SLING-11619) Restore safeguard mechanism for discovery config

2022-10-24 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623237#comment-17623237
 ] 

Stefan Egli commented on SLING-11619:
-

updated the PR and marked ready for review

> Restore safeguard mechanism for discovery config
> 
>
> Key: SLING-11619
> URL: https://issues.apache.org/jira/browse/SLING-11619
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> With the [update to parent 
> 47|https://github.com/apache/sling-org-apache-sling-discovery-oak/commit/c306408f36e7636c72b71805d2bb0e3e6f0f0e73#diff-73d443e41e9bfaa5e9c77b6db0e318079f1885f5a7ed9685aae9730209adc579]
>  the discovery.oak's Config "lost" the ability to gracefully deal with wrong 
> values, such as empty strings. It used to silently swallow these, but now 
> fails loudly with
> {noformat}
> org.osgi.service.component.ComponentException: 
> java.lang.NumberFormatException: For input string: ""
>   at 
> org.apache.felix.scr.impl.inject.internal.Annotations$Handler.invoke(Annotations.java:379)
>  [org.apache.felix.scr:2.2.0]
>   at com.sun.proxy.$Proxy368.backoffStandbyFactor(Unknown Source)
>   at org.apache.sling.discovery.oak.Config.configure(Config.java:238) 
> [org.apache.sling.discovery.oak:1.2.40]
>   at org.apache.sling.discovery.oak.Config.activate(Config.java:159) 
> [org.apache.sling.discovery.oak:1.2.40]
> {noformat}





[jira] [Commented] (SLING-11619) Restore safeguard mechanism for discovery config

2022-10-12 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17616564#comment-17616564
 ] 

Stefan Egli commented on SLING-11619:
-

Started work in a [draft 
PR|https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/11]

> Restore safeguard mechanism for discovery config
> 
>
> Key: SLING-11619
> URL: https://issues.apache.org/jira/browse/SLING-11619
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Minor
>
> With the [update to parent 
> 47|https://github.com/apache/sling-org-apache-sling-discovery-oak/commit/c306408f36e7636c72b71805d2bb0e3e6f0f0e73#diff-73d443e41e9bfaa5e9c77b6db0e318079f1885f5a7ed9685aae9730209adc579]
>  the discovery.oak's Config "lost" the ability to gracefully deal with wrong 
> values, such as empty strings. It used to silently swallow these, but now 
> fails loudly with
> {noformat}
> org.osgi.service.component.ComponentException: 
> java.lang.NumberFormatException: For input string: ""
>   at 
> org.apache.felix.scr.impl.inject.internal.Annotations$Handler.invoke(Annotations.java:379)
>  [org.apache.felix.scr:2.2.0]
>   at com.sun.proxy.$Proxy368.backoffStandbyFactor(Unknown Source)
>   at org.apache.sling.discovery.oak.Config.configure(Config.java:238) 
> [org.apache.sling.discovery.oak:1.2.40]
>   at org.apache.sling.discovery.oak.Config.activate(Config.java:159) 
> [org.apache.sling.discovery.oak:1.2.40]
> {noformat}





[jira] [Created] (SLING-11619) Restore safeguard mechanism for discovery config

2022-10-12 Thread Stefan Egli (Jira)
Stefan Egli created SLING-11619:
---

 Summary: Restore safeguard mechanism for discovery config
 Key: SLING-11619
 URL: https://issues.apache.org/jira/browse/SLING-11619
 Project: Sling
  Issue Type: Task
  Components: Discovery
Reporter: Stefan Egli
Assignee: Stefan Egli


With the [update to parent 
47|https://github.com/apache/sling-org-apache-sling-discovery-oak/commit/c306408f36e7636c72b71805d2bb0e3e6f0f0e73#diff-73d443e41e9bfaa5e9c77b6db0e318079f1885f5a7ed9685aae9730209adc579]
 the discovery.oak's Config "lost" the ability to gracefully deal with wrong 
values, such as empty strings. It used to silently swallow these, but now fails 
loudly with
{noformat}
org.osgi.service.component.ComponentException: java.lang.NumberFormatException: 
For input string: ""
at 
org.apache.felix.scr.impl.inject.internal.Annotations$Handler.invoke(Annotations.java:379)
 [org.apache.felix.scr:2.2.0]
at com.sun.proxy.$Proxy368.backoffStandbyFactor(Unknown Source)
at org.apache.sling.discovery.oak.Config.configure(Config.java:238) 
[org.apache.sling.discovery.oak:1.2.40]
at org.apache.sling.discovery.oak.Config.activate(Config.java:159) 
[org.apache.sling.discovery.oak:1.2.40]
{noformat}
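
The safeguard described above can be sketched as follows. This is a minimal illustration, not the actual discovery.oak change: the class and method names (SafeConfigValues, toLong) are hypothetical, and the idea is only that an empty, missing or malformed value falls back to a default instead of surfacing a NumberFormatException during activation.

```java
// Hypothetical helper, not part of discovery.oak. It illustrates falling back
// to a default when a configured value is empty or not a number.
final class SafeConfigValues {

    private SafeConfigValues() {
    }

    /** Returns the parsed value, or defaultValue if raw is null, blank or malformed. */
    static long toLong(Object raw, long defaultValue) {
        if (raw == null) {
            return defaultValue;
        }
        String text = raw.toString().trim();
        if (text.isEmpty()) {
            // This is the case from the stack trace: For input string: ""
            return defaultValue;
        }
        try {
            return Long.parseLong(text);
        } catch (NumberFormatException e) {
            // Swallow the error so the component can still activate with the default.
            return defaultValue;
        }
    }
}
```

A caller would read the raw property value and pass its own default, for example SafeConfigValues.toLong(rawValue, 5L), where the default of 5 is an arbitrary placeholder. Whether a bad value should also be logged is a separate design choice that this sketch leaves out.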





[jira] [Closed] (SLING-11450) Partially started instance suppression can lead to unwanted leader loss

2022-08-03 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11450.
---

> Partially started instance suppression can lead to unwanted leader loss
> ---
>
> Key: SLING-11450
> URL: https://issues.apache.org/jira/browse/SLING-11450
> Project: Sling
>  Issue Type: Bug
>  Components: Discovery
>Affects Versions: Discovery Base 2.0.12, Discovery Oak 1.2.36
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Base 2.0.14, Discovery Oak 1.2.38
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> SLING-10489 introduced "partial startup suppression", sometimes also referred 
> to as "joinerdelay" (even though the latter is actually a subfeature of the 
> former).
> With this suppression enabled (it is disabled by default), upon a topology 
> change the leader instance can lose its leader status even though it did not 
> actually leave the topology or crash. This is against the discovery API 
> contract, which says that the leader stays leader until it crashes.
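
The contract mentioned above (a leader stays leader until it crashes) can be illustrated with a small sketch. This is not the discovery.oak election code: the class StickyLeader, its elect method and the "first entry wins" fallback are assumptions made purely for illustration.

```java
import java.util.List;

// Hypothetical sketch of a "sticky" leader rule; not the discovery.oak implementation.
final class StickyLeader {

    private StickyLeader() {
    }

    /**
     * Picks the leader for a new topology view. The current leader is kept as
     * long as it is still part of the view. Only when it is gone is a new one
     * chosen (here simply the first id in the list, an arbitrary stand-in for
     * the real election rule).
     */
    static String elect(String currentLeader, List<String> activeInstanceIds) {
        if (activeInstanceIds.isEmpty()) {
            throw new IllegalArgumentException("a view needs at least one instance");
        }
        if (currentLeader != null && activeInstanceIds.contains(currentLeader)) {
            return currentLeader;
        }
        return activeInstanceIds.get(0);
    }
}
```

With a rule like this, suppressing or un-suppressing other instances changes the list of active ids but never moves leadership away from an instance that is still present.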





[jira] [Closed] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-08-03 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli closed SLING-11496.
---

> Fresh instance must remain suppressed until syncToken stored
> 
>
> Key: SLING-11496
> URL: https://issues.apache.org/jira/browse/SLING-11496
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.36
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.40
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The changes in SLING-11450 still miss one case: if an instance 
> reuses the clusterNodeId but is slow, it is not suppressed. The reason is that 
> there's no cleanup of data in /var/discovery/oak/idMap and 
> ./clusterInstances. So if it reuses the clusterNodeId, the old data from a 
> previous instance would still be there, and the other instances do not 
> distinguish where the data originated.
> The only way to detect a clusterNodeId reuse is to require the instance to 
> update the syncToken. Until it does that, it is suppressed. Once it does, it 
> joins the cluster regularly. From then on, the syncToken is no longer 
> checked (since existing instances are exempted from that check).





[RESULT] [VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.40 and Sling Discovery Commons 1.0.28

2022-08-03 Thread Stefan Egli

Hi,

The vote has passed with the following result:

+1 (binding): Nicolas Peltier, Carsten Ziegeler, Eric Norman
+1 (non binding): Ashok Pelluru

I will copy this release to the Sling dist directory and
promote the artifacts to the central Maven repository.

Cheers,
Stefan

On 26.07.22 17:51, Stefan Egli wrote:

Hi,

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352121
https://issues.apache.org/jira/browse/SLING/fixforversion/12351433

There are 2 outstanding issues:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352142
https://issues.apache.org/jira/browse/SLING/fixforversion/12352141

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2658/


=
NOTE: discovery.oak 1.2.40 supersedes 1.2.38, which is currently in voting as well. I decided 
against cancelling that vote though, as it also contains the dependent discovery.base 2.0.14. So please 
consider voting for the other as well, it brings good karma!

=


You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD 



Usage:
sh check_staged_release.sh 2658 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[RESULT] [VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.38 and Sling Discovery Base 2.0.14

2022-08-03 Thread Stefan Egli

Hi,

The vote has passed with the following result:

+1 (binding): Eric Norman, Joerg Hoh, myself
+1 (non binding): Ashok Pelluru

I will copy this release to the Sling dist directory and
promote the artifacts to the central Maven repository.

Cheers,
Stefan

On 21.07.22 16:46, Stefan Egli wrote:

Hi,

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12351434
https://issues.apache.org/jira/browse/SLING/fixforversion/12351439

There are 2 outstanding issues:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352120
https://issues.apache.org/jira/browse/SLING/fixforversion/12352121

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2657/


You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD 



Usage:
sh check_staged_release.sh 2657 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


Re: [VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.38 and Sling Discovery Base 2.0.14

2022-08-02 Thread Stefan Egli

Hi,

Friendly bump : this vote would need another binding +1

Thanks!
Cheers,
Stefan

On 27.07.22 19:20, Eric Norman wrote:

+1

On Thu, Jul 21, 2022 at 7:46 AM Stefan Egli  wrote:


Hi,

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12351434
https://issues.apache.org/jira/browse/SLING/fixforversion/12351439

There are 2 outstanding issues:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352120
https://issues.apache.org/jira/browse/SLING/fixforversion/12352121

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2657/


You can use this UNIX script to download the release and verify the
signatures:

https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD

Usage:
sh check_staged_release.sh 2657 /tmp/sling-staging

Please vote to approve this release:

[ ] +1 Approve the release
[ ]  0 Don't care
[ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan





[VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.40 and Sling Discovery Commons 1.0.28

2022-07-26 Thread Stefan Egli

Hi,

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352121
https://issues.apache.org/jira/browse/SLING/fixforversion/12351433

There are 2 outstanding issues:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352142
https://issues.apache.org/jira/browse/SLING/fixforversion/12352141

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2658/


=
NOTE: discovery.oak 1.2.40 supersedes 1.2.38, which is currently in voting as well. I decided 
against cancelling that vote though, as it also contains the dependent discovery.base 2.0.14. So please 
consider voting for the other as well, it brings good karma!

=


You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD

Usage:
sh check_staged_release.sh 2658 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[jira] [Updated] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-07-26 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11496:

Fix Version/s: Discovery Oak 1.2.40
   (was: Discovery Oak 1.2.38)

> Fresh instance must remain suppressed until syncToken stored
> 
>
> Key: SLING-11496
> URL: https://issues.apache.org/jira/browse/SLING-11496
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.36
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.40
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The changes in SLING-11450 still miss one case: if an instance 
> reuses the clusterNodeId but is slow, it is not suppressed. The reason is that 
> there's no cleanup of data in /var/discovery/oak/idMap and 
> ./clusterInstances. So if it reuses the clusterNodeId, the old data from a 
> previous instance would still be there, and the other instances do not 
> distinguish where the data originated.
> The only way to detect a clusterNodeId reuse is to require the instance to 
> update the syncToken. Until it does that, it is suppressed. Once it does, it 
> joins the cluster regularly. From then on, the syncToken is no longer 
> checked (since existing instances are exempted from that check).





[jira] [Updated] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-07-26 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11496:

Affects Version/s: Discovery Oak 1.2.36

> Fresh instance must remain suppressed until syncToken stored
> 
>
> Key: SLING-11496
> URL: https://issues.apache.org/jira/browse/SLING-11496
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>Affects Versions: Discovery Oak 1.2.36
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.38
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The changes in SLING-11450 still miss one case: if an instance 
> reuses the clusterNodeId but is slow, it is not suppressed. The reason is that 
> there's no cleanup of data in /var/discovery/oak/idMap and 
> ./clusterInstances. So if it reuses the clusterNodeId, the old data from a 
> previous instance would still be there, and the other instances do not 
> distinguish where the data originated.
> The only way to detect a clusterNodeId reuse is to require the instance to 
> update the syncToken. Until it does that, it is suppressed. Once it does, it 
> joins the cluster regularly. From then on, the syncToken is no longer 
> checked (since existing instances are exempted from that check).





[jira] [Updated] (SLING-9625) DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException

2022-07-26 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-9625:
---
Fix Version/s: Discovery Oak 1.2.42
   (was: Discovery Oak 1.2.40)

> DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException 
> -
>
> Key: SLING-9625
> URL: https://issues.apache.org/jira/browse/SLING-9625
> Project: Sling
>  Issue Type: Improvement
>Affects Versions: Discovery Oak 1.2.30
>Reporter: Konrad Windszus
>Priority: Major
> Fix For: Discovery Oak 1.2.42
>
>
> While stopping the OSGi container (Sling Starter 12 SNAPSHOT) I observed the 
> following error
> {code}
> 03.08.2020 10:30:06.262 *INFO * [Apache Sling Terminator] Stopping Apache 
> Sling
> ERROR: bundle org.apache.sling.discovery.oak:1.2.28 
> (139)[org.apache.sling.discovery.oak.OakDiscoveryService(200)] : The 
> updatedPropertyProvider method has thrown an exception
> java.lang.RuntimeException: Could not log in to repository 
> (org.apache.sling.api.resource.LoginException: Cannot derive user name for 
> bundle org.apache.sling.discovery.oak [139] and sub service null)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.doUpdateProperties(OakDiscoveryService.java:540)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.bindPropertyProviderInteral(OakDiscoveryService.java:406)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.updatedPropertyProvider(OakDiscoveryService.java:421)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invokeMethod(BaseMethod.java:242)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.access$500(BaseMethod.java:41)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$Resolved.invoke(BaseMethod.java:678)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$NotResolved.invoke(BaseMethod.java:633)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invoke(BaseMethod.java:524)
>   at 
> org.apache.felix.scr.impl.inject.methods.BindMethod.invoke(BindMethod.java:42)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager.invokeUpdatedMethod(DependencyManager.java:1934)
>   at 
> org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUpdatedMethod(SingleComponentManager.java:448)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:366)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:297)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1229)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1137)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.track(ServiceTracker.java:883)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1168)
>   at 
> org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:125)
>   at 
> org.apache.felix.framework.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:990)
>   at 
> org.apache.felix.framework.EventDispatcher.fireEventImmediately(EventDispatcher.java:838)
>   at 
> org.apache.felix.framework.EventDispatcher.fireServiceEvent(EventDispatcher.java:545)
>   at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4833)
>   at org.apache.felix.framework.Felix.access$000(Felix.java:112)
>   at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:434)
>   at 
> org.apache.felix.framework.ServiceRegistry.servicePropertiesModified(ServiceRegistry.java:601)
>   at 
> org.apache.felix.framework.ServiceRegistrationImpl.setProperties(ServiceRegistrationImpl.java:132)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindService(JobConsumerManager.java:354)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindJobExecutor(JobConsumerManager.java:270)
>   at sun.reflec

[jira] [Updated] (SLING-11355) Update parent bundle (48) to sling-discovery modules

2022-07-26 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11355:

Fix Version/s: Discovery Oak 1.2.40
   (was: Discovery Oak 1.2.38)

> Update parent bundle (48) to sling-discovery modules
> 
>
> Key: SLING-11355
> URL: https://issues.apache.org/jira/browse/SLING-11355
> Project: Sling
>  Issue Type: Sub-task
>Reporter: Ashok Pelluru
>Assignee: Ashok Pelluru
>Priority: Major
> Fix For: Discovery Impl 1.2.14, Discovery Support 1.0.6, 
> Discovery Commons 1.0.28, Discovery Base 2.0.14, Discovery Oak 1.2.40
>
>  Time Spent: 11h 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (SLING-5598) Exclude slow tests by default with assume(sling.slow.tests.enabled)

2022-07-26 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-5598:
---
Fix Version/s: Discovery Commons 1.0.30
   Discovery Oak 1.2.42
   (was: Discovery Commons 1.0.28)
   (was: Discovery Oak 1.2.40)

> Exclude slow tests by default with assume(sling.slow.tests.enabled) 
> 
>
> Key: SLING-5598
> URL: https://issues.apache.org/jira/browse/SLING-5598
> Project: Sling
>  Issue Type: Task
>  Components: Extensions
>Affects Versions: Discovery Impl 1.2.6, Discovery Base 1.1.2, Discovery 
> Commons 1.0.10, Discovery Oak 1.2.6
>Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Impl 1.2.14, Discovery Base 2.0.16, Discovery 
> Commons 1.0.30, Discovery Oak 1.2.42
>
> Attachments: SLING-5598-commons-testing.patch, 
> SLING-5598-discovery.patch
>
>
> As suggested by [~bdelacretaz] on [the 
> list|http://markmail.org/message/yad5awqg53epk3ck] we should improve test 
> duration (ideally 1-2min per bundle max, 10-15min overall). Until they are 
> improved, however, slow tests should be excluded by default and run 
> only if enabled explicitly. Here's an example {{@Before}} method to achieve 
> that:
> {noformat}
> @Before
> public void checkSlowTests() {
>     assumeNotNull(System.getProperty("sling.slow.tests.enabled"));
> }
> {noformat}
> and to enable the slow tests you do: {{mvn -Dsling.slow.tests.enabled=true 
> clean test}}
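
One detail of the gate quoted above: a null check passes for any value of the property, so -Dsling.slow.tests.enabled=false would enable the slow tests as well. The sketch below contrasts that presence check with a stricter "must be true" check. The class SlowTestGate and its method names are hypothetical; only the property name comes from the issue.

```java
// Hypothetical helper contrasting two ways to read the slow-test switch.
final class SlowTestGate {

    static final String PROPERTY = "sling.slow.tests.enabled";

    private SlowTestGate() {
    }

    /** Mirrors the null check from the issue: any value, even "false", enables the tests. */
    static boolean enabledIfPresent() {
        return System.getProperty(PROPERTY) != null;
    }

    /** Stricter variant: enabled only if the property is set to "true" (case-insensitive). */
    static boolean enabledIfTrue() {
        return Boolean.getBoolean(PROPERTY);
    }
}
```

In a JUnit 4 test, either result could be passed to an assumption in the {{@Before}} method. Which variant is preferable depends on whether an explicit "false" should be honoured.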





[jira] [Resolved] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-07-26 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-11496.
-
Resolution: Fixed

merged

> Fresh instance must remain suppressed until syncToken stored
> 
>
> Key: SLING-11496
> URL: https://issues.apache.org/jira/browse/SLING-11496
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>    Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.38
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The changes in SLING-11450 still miss one case: if an instance 
> reuses the clusterNodeId but is slow, it is not suppressed. The reason is that 
> there's no cleanup of data in /var/discovery/oak/idMap and 
> ./clusterInstances. So if it reuses the clusterNodeId, the old data from a 
> previous instance would still be there, and the other instances do not 
> distinguish where the data originated.
> The only way to detect a clusterNodeId reuse is to require the instance to 
> update the syncToken. Until it does that, it is suppressed. Once it does, it 
> joins the cluster regularly. From then on, the syncToken is no longer 
> checked (since existing instances are exempted from that check).





[jira] [Commented] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-07-25 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570989#comment-17570989
 ] 

Stefan Egli commented on SLING-11496:
-

* created 
[PR#10|https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/10]

> Fresh instance must remain suppressed until syncToken stored
> 
>
> Key: SLING-11496
> URL: https://issues.apache.org/jira/browse/SLING-11496
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.38
>
>
> The changes in SLING-11450 still miss one case: if an instance 
> reuses the clusterNodeId but is slow, it is not suppressed. The reason is that 
> there's no cleanup of data in /var/discovery/oak/idMap and 
> ./clusterInstances. So if it reuses the clusterNodeId, the old data from a 
> previous instance would still be there, and the other instances do not 
> distinguish where the data originated.
> The only way to detect a clusterNodeId reuse is to require the instance to 
> update the syncToken. Until it does that, it is suppressed. Once it does, it 
> joins the cluster regularly. From then on, the syncToken is no longer 
> checked (since existing instances are exempted from that check).





[jira] [Commented] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-07-25 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570925#comment-17570925
 ] 

Stefan Egli commented on SLING-11496:
-

* working on this 
[here|https://github.com/stefan-egli/sling-org-apache-sling-discovery-oak/tree/SLING-11496]

> Fresh instance must remain suppressed until syncToken stored
> 
>
> Key: SLING-11496
> URL: https://issues.apache.org/jira/browse/SLING-11496
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.38
>
>
> The changes in SLING-11450 still miss one case: if an instance 
> reuses the clusterNodeId but is slow, it is not suppressed. The reason is that 
> there's no cleanup of data in /var/discovery/oak/idMap and 
> ./clusterInstances. So if it reuses the clusterNodeId, the old data from a 
> previous instance would still be there, and the other instances do not 
> distinguish where the data originated.
> The only way to detect a clusterNodeId reuse is to require the instance to 
> update the syncToken. Until it does that, it is suppressed. Once it does, it 
> joins the cluster regularly. From then on, the syncToken is no longer 
> checked (since existing instances are exempted from that check).





[jira] [Updated] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-07-25 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-11496:

Fix Version/s: Discovery Oak 1.2.38

> Fresh instance must remain suppressed until syncToken stored
> 
>
> Key: SLING-11496
> URL: https://issues.apache.org/jira/browse/SLING-11496
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>    Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Oak 1.2.38
>
>
> The changes in SLING-11450 still miss one case: if an instance 
> reuses the clusterNodeId but is slow, it is not suppressed. The reason is that 
> there's no cleanup of data in /var/discovery/oak/idMap and 
> ./clusterInstances. So if it reuses the clusterNodeId, the old data from a 
> previous instance would still be there, and the other instances do not 
> distinguish where the data originated.
> The only way to detect a clusterNodeId reuse is to require the instance to 
> update the syncToken. Until it does that, it is suppressed. Once it does, it 
> joins the cluster regularly. From then on, the syncToken is no longer 
> checked (since existing instances are exempted from that check).





[jira] [Created] (SLING-11496) Fresh instance must remain suppressed until syncToken stored

2022-07-25 Thread Stefan Egli (Jira)
Stefan Egli created SLING-11496:
---

 Summary: Fresh instance must remain suppressed until syncToken 
stored
 Key: SLING-11496
 URL: https://issues.apache.org/jira/browse/SLING-11496
 Project: Sling
  Issue Type: Task
  Components: Discovery
Reporter: Stefan Egli
Assignee: Stefan Egli


The changes in SLING-11450 still miss one case: if an instance reuses 
the clusterNodeId but is slow, it is not suppressed. The reason is that there's 
no cleanup of data in /var/discovery/oak/idMap and ./clusterInstances. So if it 
reuses the clusterNodeId, the old data from a previous instance would still be 
there, and the other instances do not distinguish where the data originated.

The only way to detect a clusterNodeId reuse is to require the instance to update 
the syncToken. Until it does that, it is suppressed. Once it does, it joins 
the cluster regularly. From then on, the syncToken is no longer checked (since 
existing instances are exempted from that check).
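
The rule described above can be sketched as a small state machine. This is an illustration under assumptions, not the discovery.oak code: the class JoinSuppression, its methods and the notion that a stale token from a previous instance differs from the current one are all hypothetical simplifications.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of "suppressed until the syncToken is stored".
final class JoinSuppression {

    // clusterNodeIds that have already joined regularly and are exempt from the check
    private final Set<Integer> establishedIds = new HashSet<>();

    /**
     * Decides whether the instance with the given clusterNodeId is suppressed.
     * storedSyncToken is the token found for that id (null if none, possibly
     * stale data left by a previous instance). currentSyncToken is the token
     * of the current view.
     */
    boolean isSuppressed(int clusterNodeId, String storedSyncToken, String currentSyncToken) {
        if (establishedIds.contains(clusterNodeId)) {
            return false; // existing instances are exempted from the check
        }
        if (currentSyncToken.equals(storedSyncToken)) {
            establishedIds.add(clusterNodeId); // token stored: joins regularly from now on
            return false;
        }
        return true; // fresh or id-reusing instance that has not stored the token yet
    }

    /** Forget an id once its instance has left, so that a reuse is checked again. */
    void instanceLeft(int clusterNodeId) {
        establishedIds.remove(clusterNodeId);
    }
}
```

The point of the sketch is that leftover data alone never admits an instance. Only a token matching the current view does, and after that the instance is no longer checked.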





Re: [VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.38 and Sling Discovery Base 2.0.14

2022-07-25 Thread Stefan Egli

+1

self votes are da best!

Cheers,
Stefan

On 21.07.22 16:46, Stefan Egli wrote:

Hi,

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12351434
https://issues.apache.org/jira/browse/SLING/fixforversion/12351439

There are 2 outstanding issues:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352120
https://issues.apache.org/jira/browse/SLING/fixforversion/12352121

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2657/


You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD 



Usage:
sh check_staged_release.sh 2657 /tmp/sling-staging

Please vote to approve this release:

   [ ] +1 Approve the release
   [ ]  0 Don't care
   [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[VOTE] Release Apache Sling Oak-Based Discovery Service 1.2.38 and Sling Discovery Base 2.0.14

2022-07-21 Thread Stefan Egli

Hi,

We solved 2 issues in this release:
https://issues.apache.org/jira/browse/SLING/fixforversion/12351434
https://issues.apache.org/jira/browse/SLING/fixforversion/12351439

There are 2 outstanding issues:
https://issues.apache.org/jira/browse/SLING/fixforversion/12352120
https://issues.apache.org/jira/browse/SLING/fixforversion/12352121

Staging repository:
https://repository.apache.org/content/repositories/orgapachesling-2657/


You can use this UNIX script to download the release and verify the signatures:
https://gitbox.apache.org/repos/asf?p=sling-tooling-release.git;a=blob;f=check_staged_release.sh;hb=HEAD

Usage:
sh check_staged_release.sh 2657 /tmp/sling-staging

Please vote to approve this release:

  [ ] +1 Approve the release
  [ ]  0 Don't care
  [ ] -1 Don't release, because ...

This majority vote is open for at least 72 hours.

Cheers,
Stefan


[jira] [Updated] (SLING-9625) DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException

2022-07-21 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-9625:
---
Fix Version/s: Discovery Oak 1.2.40
   (was: Discovery Oak 1.2.38)

> DiscoveryServiceImpl#doUpdateProperties may fail due to a LoginException 
> -
>
> Key: SLING-9625
> URL: https://issues.apache.org/jira/browse/SLING-9625
> Project: Sling
>  Issue Type: Improvement
>Affects Versions: Discovery Oak 1.2.30
>Reporter: Konrad Windszus
>Priority: Major
> Fix For: Discovery Oak 1.2.40
>
>
> While stopping the OSGi container (Sling Starter 12 SNAPSHOT) I observed the 
> following error
> {code}
> 03.08.2020 10:30:06.262 *INFO * [Apache Sling Terminator] Stopping Apache 
> Sling
> ERROR: bundle org.apache.sling.discovery.oak:1.2.28 
> (139)[org.apache.sling.discovery.oak.OakDiscoveryService(200)] : The 
> updatedPropertyProvider method has thrown an exception
> java.lang.RuntimeException: Could not log in to repository 
> (org.apache.sling.api.resource.LoginException: Cannot derive user name for 
> bundle org.apache.sling.discovery.oak [139] and sub service null)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.doUpdateProperties(OakDiscoveryService.java:540)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.bindPropertyProviderInteral(OakDiscoveryService.java:406)
>   at 
> org.apache.sling.discovery.oak.OakDiscoveryService.updatedPropertyProvider(OakDiscoveryService.java:421)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invokeMethod(BaseMethod.java:242)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.access$500(BaseMethod.java:41)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$Resolved.invoke(BaseMethod.java:678)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod$NotResolved.invoke(BaseMethod.java:633)
>   at 
> org.apache.felix.scr.impl.inject.methods.BaseMethod.invoke(BaseMethod.java:524)
>   at 
> org.apache.felix.scr.impl.inject.methods.BindMethod.invoke(BindMethod.java:42)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager.invokeUpdatedMethod(DependencyManager.java:1934)
>   at 
> org.apache.felix.scr.impl.manager.SingleComponentManager.invokeUpdatedMethod(SingleComponentManager.java:448)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:366)
>   at 
> org.apache.felix.scr.impl.manager.DependencyManager$MultipleDynamicCustomizer.modifiedService(DependencyManager.java:297)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1229)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.customizerModified(ServiceTracker.java:1137)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$AbstractTracked.track(ServiceTracker.java:883)
>   at 
> org.apache.felix.scr.impl.manager.ServiceTracker$Tracked.serviceChanged(ServiceTracker.java:1168)
>   at 
> org.apache.felix.scr.impl.BundleComponentActivator$ListenerInfo.serviceChanged(BundleComponentActivator.java:125)
>   at 
> org.apache.felix.framework.EventDispatcher.invokeServiceListenerCallback(EventDispatcher.java:990)
>   at 
> org.apache.felix.framework.EventDispatcher.fireEventImmediately(EventDispatcher.java:838)
>   at 
> org.apache.felix.framework.EventDispatcher.fireServiceEvent(EventDispatcher.java:545)
>   at org.apache.felix.framework.Felix.fireServiceEvent(Felix.java:4833)
>   at org.apache.felix.framework.Felix.access$000(Felix.java:112)
>   at org.apache.felix.framework.Felix$1.serviceChanged(Felix.java:434)
>   at 
> org.apache.felix.framework.ServiceRegistry.servicePropertiesModified(ServiceRegistry.java:601)
>   at 
> org.apache.felix.framework.ServiceRegistrationImpl.setProperties(ServiceRegistrationImpl.java:132)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindService(JobConsumerManager.java:354)
>   at 
> org.apache.sling.event.impl.jobs.JobConsumerManager.unbindJobExecutor(JobConsumerManager.java:270)
>   at sun.reflec

[jira] [Updated] (SLING-5598) Exclude slow tests by default with assume(sling.slow.tests.enabled)

2022-07-21 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli updated SLING-5598:
---
Fix Version/s: Discovery Base 2.0.16
   Discovery Oak 1.2.40
   (was: Discovery Base 2.0.14)
   (was: Discovery Oak 1.2.38)

> Exclude slow tests by default with assume(sling.slow.tests.enabled) 
> 
>
> Key: SLING-5598
> URL: https://issues.apache.org/jira/browse/SLING-5598
> Project: Sling
>  Issue Type: Task
>  Components: Extensions
>Affects Versions: Discovery Impl 1.2.6, Discovery Base 1.1.2, Discovery 
> Commons 1.0.10, Discovery Oak 1.2.6
>Reporter: Stefan Egli
>Priority: Major
> Fix For: Discovery Impl 1.2.14, Discovery Commons 1.0.28, 
> Discovery Base 2.0.16, Discovery Oak 1.2.40
>
> Attachments: SLING-5598-commons-testing.patch, 
> SLING-5598-discovery.patch
>
>
> As suggested by [~bdelacretaz] on [the 
> list|http://markmail.org/message/yad5awqg53epk3ck] we should improve test 
> duration (ideally 1-2min per bundle max, 10-15min overall). While they are 
> not yet improved however, slow tests should be excluded by default and run 
> only if enabled explicitly. Here's an example {{@Before}} method to achieve 
> that:
> {noformat}
> // Skips every test in this class unless the system property is set
> @Before
> public void checkSlowTests() {
>     assumeNotNull(System.getProperty("sling.slow.tests.enabled"));
> }
> {noformat}
> and to enable the slow tests you do: {{mvn -Dsling.slow.tests.enabled=true 
> clean test}}
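
One detail of the snippet above is worth spelling out: assumeNotNull only checks that the property is set at all, so -Dsling.slow.tests.enabled=false would still enable the slow tests. The sketch below contrasts that behaviour with a stricter variant based on Boolean.getBoolean. The class name SlowTestGate and its method names are made up for illustration, and the code deliberately uses only the JDK so that it runs without JUnit.

```java
/** Hypothetical helper showing two ways to interpret the system property. */
public class SlowTestGate {

    static final String PROP = "sling.slow.tests.enabled";

    /** Same condition as assumeNotNull(System.getProperty(PROP)): any value enables. */
    static boolean enabledIfSet() {
        return System.getProperty(PROP) != null;
    }

    /** Stricter: only the value "true" (case-insensitive) enables. */
    static boolean enabledIfTrue() {
        return Boolean.getBoolean(PROP);
    }

    public static void main(String[] args) {
        System.out.println("enabled if set:  " + enabledIfSet());
        System.out.println("enabled if true: " + enabledIfTrue());
    }
}
```

Whether the stricter variant is preferable is a judgment call. The form proposed in the issue has the advantage of being a one-liner in a {{@Before}} method.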





[jira] [Resolved] (SLING-11450) Partially started instance suppression can lead to unwanted leader loss

2022-07-21 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-11450.
-
Resolution: Fixed

> Partially started instance suppression can lead to unwanted leader loss
> ---
>
> Key: SLING-11450
> URL: https://issues.apache.org/jira/browse/SLING-11450
> Project: Sling
>  Issue Type: Bug
>  Components: Discovery
>Affects Versions: Discovery Base 2.0.12, Discovery Oak 1.2.36
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Base 2.0.14, Discovery Oak 1.2.38
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> SLING-10489 introduced "partial startup suppression" sometimes also referred 
> to as "joinerdelay" (even though the latter is actually a subfeature of the 
> former).
> With this suppression enabled (it is disabled by default), upon a topology 
> change the leader instance can lose its leader status even though it did not 
> actually leave the topology or crash. This is against the discovery API 
> contract, which says that the leader stays leader until it crashes.





[jira] [Commented] (SLING-11450) Partially started instance suppression can lead to unwanted leader loss

2022-07-21 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17569456#comment-17569456
 ] 

Stefan Egli commented on SLING-11450:
-

* merged both PRs
* releasing both bundles next

> Partially started instance suppression can lead to unwanted leader loss
> ---
>
> Key: SLING-11450
> URL: https://issues.apache.org/jira/browse/SLING-11450
> Project: Sling
>  Issue Type: Bug
>  Components: Discovery
>Affects Versions: Discovery Base 2.0.12, Discovery Oak 1.2.36
>Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
> Fix For: Discovery Base 2.0.14, Discovery Oak 1.2.38
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> SLING-10489 introduced "partial startup suppression" sometimes also referred 
> to as "joinerdelay" (even though the latter is actually a subfeature of the 
> former).
> With this suppression enabled (it is disabled by default), upon a topology 
> change the leader instance can lose its leader status even though it did not 
> actually leave the topology or crash. This is against the discovery API 
> contract, which says that the leader stays leader until it crashes.





[jira] [Resolved] (SLING-11470) Revert discovery.base impl separation, bump major package version instead

2022-07-20 Thread Stefan Egli (Jira)


 [ 
https://issues.apache.org/jira/browse/SLING-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-11470.
-
Resolution: Fixed

Thx [~apelluru] for the review, merged the PR now.

> Revert discovery.base impl separation, bump major package version instead
> -
>
> Key: SLING-11470
> URL: https://issues.apache.org/jira/browse/SLING-11470
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>    Assignee: Stefan Egli
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is a follow-up of SLING-11355 ([discovery-base 
> PR#7|https://github.com/apache/sling-org-apache-sling-discovery-base/pull/7])
> As part of that PR impl classes of announcement and ping packages got moved 
> to impl subpackages to have better separation. Also, the package version bump 
> was suppressed.
> As now noticed by [~mreutegg], there is an issue with this change (that 
> blocks [discovery-oak 
> PR#7|https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/7] ) 
> : the {{CachedAnnouncement}} is part of the announcement package's API but 
> was now made private by moving it to impl.
> There are probably several options for fixing this; listing two of them:
>  # keep the impl class separated. But fix {{CachedAnnouncement}} by placing 
> it back to the public package. This would require the class to be split into 
> an interface/implementation pair to avoid making registerPing public. 
> Additionally, continue the impl separation for the ping package by also moving 
> the other 2 remaining implementation classes to impl : 
> {{TopologyConnectorServlet}} and {{TopologyRequestValidator}} (with 
> corresponding adjustments in tests).
>  # go back to the original, non separated way (even though this was not best 
> practice).
> Also:
> * in both cases I would actually argue (a bit late) to not overrule the 
> baseline check and actually do the major version bumps. In hindsight that seems 
> more appropriate, as it would ensure downstream users do the required upgrade.
> So, I would vote for option 2 + package bumps, as these are fewer changes and 
> the discovery.base package is mostly really only used by discovery.oak these 
> days, so I don't see a strong need for beautifying and introducing impl 
> separation.
> [~apelluru], [~kwin], [~rombert], wdyt?
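
For illustration, the interface/implementation split mentioned in option 1 could look roughly like the sketch below. Everything except the names CachedAnnouncement and registerPing is an assumption: the accessor getLastPing, the signature of registerPing, and the wrapper class AnnouncementSplitSketch are hypothetical, and the two types are nested in one file only to keep the example self-contained. In a real bundle the interface would sit in the exported package and the implementation in the non-exported impl subpackage.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of an interface/implementation split. */
public class AnnouncementSplitSketch {

    /** Would live in the exported announcement package. */
    public interface CachedAnnouncement {
        /** Hypothetical read-only accessor, for illustration only. */
        long getLastPing();
    }

    /** Would live in a non-exported impl subpackage. */
    static final class CachedAnnouncementImpl implements CachedAnnouncement {
        private final AtomicLong lastPing = new AtomicLong();

        @Override
        public long getLastPing() {
            return lastPing.get();
        }

        /** Not part of the interface, so clients of the API cannot call it. */
        void registerPing(long now) {
            lastPing.set(now);
        }
    }

    public static void main(String[] args) {
        CachedAnnouncementImpl impl = new CachedAnnouncementImpl();
        impl.registerPing(42L);
        CachedAnnouncement api = impl; // clients only ever see the interface
        System.out.println(api.getLastPing());
    }
}
```

This shows why option 1 means more changes than option 2: every user of the class has to be switched to the interface, while the mutating method stays reachable only from the implementation side.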





[jira] [Commented] (SLING-11470) Revert discovery.base impl separation, bump major package version instead

2022-07-20 Thread Stefan Egli (Jira)


[ 
https://issues.apache.org/jira/browse/SLING-11470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17568968#comment-17568968
 ] 

Stefan Egli commented on SLING-11470:
-

* Created 
[PR#10|https://github.com/apache/sling-org-apache-sling-discovery-base/pull/10]

> Revert discovery.base impl separation, bump major package version instead
> -
>
> Key: SLING-11470
> URL: https://issues.apache.org/jira/browse/SLING-11470
> Project: Sling
>  Issue Type: Task
>  Components: Discovery
>    Reporter: Stefan Egli
>Assignee: Stefan Egli
>Priority: Major
>
> This is a follow-up of SLING-11355 ([discovery-base 
> PR#7|https://github.com/apache/sling-org-apache-sling-discovery-base/pull/7])
> As part of that PR impl classes of announcement and ping packages got moved 
> to impl subpackages to have better separation. Also, the package version bump 
> was suppressed.
> As now noticed by [~mreutegg], there is an issue with this change (that 
> blocks [discovery-oak 
> PR#7|https://github.com/apache/sling-org-apache-sling-discovery-oak/pull/7] ) 
> : the {{CachedAnnouncement}} is part of the announcement package's API but 
> was now made private by moving it to impl.
> There are probably several options for fixing this; listing two of them:
>  # keep the impl class separated. But fix {{CachedAnnouncement}} by placing 
> it back to the public package. This would require the class to be split into 
> an interface/implementation pair to avoid making registerPing public. 
> Additionally, continue the impl separation for the ping package by also moving 
> the other 2 remaining implementation classes to impl : 
> {{TopologyConnectorServlet}} and {{TopologyRequestValidator}} (with 
> corresponding adjustments in tests).
>  # go back to the original, non separated way (even though this was not best 
> practice).
> Also:
> * in both cases I would actually argue (a bit late) to not overrule the 
> baseline check and actually do the major version bumps. In hindsight that seems 
> more appropriate, as it would ensure downstream users do the required upgrade.
> So, I would vote for option 2 + package bumps, as these are fewer changes and 
> the discovery.base package is mostly really only used by discovery.oak these 
> days, so I don't see a strong need for beautifying and introducing impl 
> separation.
> [~apelluru], [~kwin], [~rombert], wdyt?




