Re: Semaphore Stuck when no acquirers to assign permit

2018-01-15 Thread Timay
I saw a release date set for 2.4 but have not had any feedback on the JIRA
issue, so I wanted to check in on this. Can this make it into the 2.4 release?

Thanks
Tim



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Cache.query with ScanQuery Hangs forever

2018-01-03 Thread Timay
It looks like it was a classpath loading issue with "Entry::getValue". Since
this is a lambda, Ignite will try to serialize the containing class, which is
neither expected on the cluster nor necessary. When I killed the process I got
a stack trace (below) pointing to that, so I assume there is some issue around
class path loading.

class org.apache.ignite.IgniteCheckedException: Query execution failed:
GridCacheQueryBean [qry=GridCacheQueryAdapter [type=SCAN, clsName=null,
clause=null, filter=ManagerServiceImpl$InactivePredicate@220fd437,
transform=ManagerServiceImpl$$Lambda$32/1251084807@7b343e3
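The point about the lambda dragging in its containing class can be checked with plain JDK code. The following is a minimal sketch, not Ignite code: SerFn and LambdaCaptureDemo are hypothetical names used only for illustration. It reads the SerializedLambda form that the JVM produces for a serializable method reference and prints the capturing class recorded in it, which is the class a remote JVM would need on its classpath to deserialize the lambda.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.function.Function;

class LambdaCaptureDemo {
    /** Hypothetical serializable function type, standing in for a closure shipped to a cluster. */
    interface SerFn<T, R> extends Function<T, R>, Serializable {}

    /** Returns the class name that a serializable lambda records as its capturing class. */
    static String capturingClassOf(Serializable lambda) {
        try {
            // Serializable lambdas carry a synthetic private writeReplace()
            // method that returns a SerializedLambda describing them.
            Method m = lambda.getClass().getDeclaredMethod("writeReplace");
            m.setAccessible(true);
            SerializedLambda sl = (SerializedLambda) m.invoke(lambda);
            return sl.getCapturingClass().replace('/', '.');
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable lambda", e);
        }
    }

    /** Creates a method reference inside this class and reports its capturing class. */
    static String demo() {
        SerFn<Map.Entry<String, Integer>, Integer> fn = Map.Entry::getValue;
        return capturingClassOf(fn);
    }

    public static void main(String[] args) {
        // The method reference targets Map.Entry, but the recorded
        // capturing class is the class where the reference was written.
        System.out.println(demo());
    }
}
```

If this matches what happens here, moving the transformer into a small standalone class that is deployed to the cluster would avoid the dependency on the service class.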


This is something I have seen before during a compute operation: during
exception handling, a library that was not supplied to the cluster got
referenced, and it failed silently but would not release the thread. Are there
any known issues around that? I may take a look if I can, but any info would
help me begin.

Thanks. 
Tim





Cache.query with ScanQuery Hangs forever

2018-01-02 Thread Timay
Hey all, using Ignite 2.1 we are trying to use IgniteCache.query to
retrieve records that match our query. We attempt this query on a timed
basis, and it gets executed when there may be no records in the cache.
Example below.

   ScanQuery query = new ScanQuery<>(new MatchingPredicate(name));
   List matches = _cache.query(query, Entry::getValue).getAll();


From the thread dump (below) it looks like in
GridCacheQueryFutureAdapter.internalIterator we get put into a
wait(Long.MAX_VALUE), which will just hang until I am long gone. I am
guessing, since I am not sure of all the use cases, that there is a bug:
when a cache is empty and a query is run, we may end up in the wait. So my
questions are:

1) What's the point of such a long wait?
2) Can we adjust the timeout on a ScanQuery basis? 
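Until the second question is answered, one caller-side way to bound the wait is to run the blocking call on a worker thread and give up after a deadline. This is a minimal sketch using only java.util.concurrent; BoundedCall and callWithTimeout are hypothetical names, and it does not change anything inside Ignite. It relies on the blocked thread responding to interruption, which the InterruptedException handler in the code further down suggests it does.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedCall {
    /**
     * Runs a blocking call on a daemon worker thread and gives up after timeoutMs.
     * Returns Optional.empty() on timeout (or if the call itself returns null).
     */
    static <T> Optional<T> callWithTimeout(Callable<T> call, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // do not keep the JVM alive for a stuck call
            return t;
        });
        Future<T> f = pool.submit(call);
        try {
            return Optional.ofNullable(f.get(timeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            f.cancel(true); // interrupts the worker thread
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            f.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("Call failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The query from the example would then be wrapped as callWithTimeout(() -> _cache.query(query, Entry::getValue).getAll(), 5000), with the 5000 ms deadline being an arbitrary choice.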
 

"Thread-15" - Thread t@79
   java.lang.Thread.State: TIMED_WAITING
    at java.lang.Object.wait(Native Method)
    - waiting on <46b8b561> (a org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryFuture)
    at org.apache.ignite.internal.processors.cache.query.GridCacheQueryFutureAdapter.internalIterator(GridCacheQueryFutureAdapter.java:304)
    at org.apache.ignite.internal.processors.cache.query.GridCacheQueryFutureAdapter.next(GridCacheQueryFutureAdapter.java:161)
    at org.apache.ignite.internal.processors.cache.query.GridCacheDistributedQueryManager$5.onHasNext(GridCacheDistributedQueryManager.java:635)
    at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.hasNextX(GridCloseableIteratorAdapter.java:53)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxy$1$1.onHasNext(IgniteCacheProxy.java:579)
    at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.hasNextX(GridCloseableIteratorAdapter.java:53)
    at org.apache.ignite.internal.util.lang.GridIteratorAdapter.hasNext(GridIteratorAdapter.java:45)
    at org.apache.ignite.internal.processors.cache.QueryCursorImpl.getAll(QueryCursorImpl.java:114)


For ease of viewing, the code:

GridCacheQueryFutureAdapter.java 
 
private Iterator internalIterator() throws IgniteCheckedException {
    checkError();

    Iterator it = null;

    while (it == null || !it.hasNext()) {
        Collection c;

        synchronized (this) {
            it = iter;

            if (it != null && it.hasNext())
                break;

            c = queue.poll();

            if (c != null)
                it = iter = c.iterator();

            if (isDone() && queue.peek() == null)
                break;
        }

        if (c == null && !isDone()) {
            loadPage();

            long timeout = qry.query().timeout();

            long waitTime = timeout == 0 ? Long.MAX_VALUE : timeout - (U.currentTimeMillis() - startTime);

            if (waitTime <= 0) {
                it = Collections.emptyList().iterator();

                break;
            }

            synchronized (this) {
                try {
                    if (queue.isEmpty() && !isDone())
                        wait(waitTime); // line 304
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();

                    throw new IgniteCheckedException("Query was interrupted: " + qry, e);
                }
            }
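To make the timeout arithmetic in that excerpt concrete, here is a small standalone sketch of the same expression. WaitTimeDemo and its parameter names are hypothetical; only the formula is taken from the code above. A timeout of 0 is treated as "no timeout" and becomes Long.MAX_VALUE, while any other value is reduced by the time already elapsed.

```java
class WaitTimeDemo {
    /**
     * Same expression as in the excerpt: timeout 0 means "wait indefinitely",
     * otherwise the remaining time is the timeout minus the elapsed time.
     */
    static long waitTime(long timeoutMs, long startTimeMs, long nowMs) {
        return timeoutMs == 0 ? Long.MAX_VALUE : timeoutMs - (nowMs - startTimeMs);
    }

    public static void main(String[] args) {
        // No timeout configured: the wait is effectively unbounded.
        System.out.println(waitTime(0, 1_000, 5_000) == Long.MAX_VALUE); // true

        // 3 s timeout, 1.5 s elapsed: 1.5 s left to wait.
        System.out.println(waitTime(3_000, 1_000, 2_500)); // 1500

        // 3 s timeout, 4 s elapsed: non-positive, so the loop returns an empty iterator.
        System.out.println(waitTime(3_000, 1_000, 5_000)); // -1000
    }
}
```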








Re: Semaphore Stuck when no acquirers to assign permit

2017-12-04 Thread Timay
From what I found, it looks like the DataStructuresProcessor EventListener
gets invoked after the dsMap has been cleared, which prevents onNodeRemoved
from being invoked. I created a pull request which invokes onNodeRemoved from
the stop method. I also added my test to the data structure test suite.

Please take a look, and let me know what your thoughts are on it. 

pull request: https://github.com/apache/ignite/pull/3138
jira: https://issues.apache.org/jira/browse/IGNITE-7090

Tim





Semaphore Stuck when no acquirers to assign permit

2017-12-01 Thread Timay
Hey all, 

We experienced an issue when trying to establish a semaphore after a
single-instance client node goes down hard (kill -9): afterwards we cannot
acquire a permit on the existing semaphore. However, if the client is
redundant, the permit is transferred successfully.

I created a modified version of SemaphoreFailoverSafeReleasePermitsTest,
which closes the initial semaphore's Ignite instance, then tries to acquire
a permit and fails.

SemaphoreFailoverNoWaitingAcquirerTest.java

  

I created a JIRA issue (https://issues.apache.org/jira/browse/IGNITE-7090) to
track this as well. I may try to dig further but wanted to get it out to the
group.

Tim





Re: Client Near Cache Configuration Lost after Cluster Node Removed

2017-10-24 Thread Timay
I believe I had the same issue. I have posted a test and my findings to the
user group, which can be found here:

http://apache-ignite-users.70518.x6.nabble.com/Near-Cache-Topoolgy-change-causes-NearCache-to-always-miss-td17539.html





Re: Near Cache Topoolgy change causes NearCache to always miss.

2017-10-24 Thread Timay
Hey Slava,

Just some more details: it looks like GridNearCacheEntry.primaryNode is the
suspect, since that is what updates the topVer. If you create 2 nodes, then X
clients, and use one of the clients to create a cache with a near cache
config, it seems to work as expected. However, if you create a cache on the
node instance and populate it, but then try to get from a near cache created
through the client, a topology change will set the topVer to None, and
primaryNode never gets to reset the topVer, causing
GridNearCacheEntry.valid to return false.

Hope that makes some sense. Attached is the test I used. It's crude but
should at least show what I am trying to convey. The good test shows what is
expected, with the only misses being the initial call and the one after the
invalidation caused by the topology change; the bad test has almost all reads
marked as misses, which causes the hard hit to the cluster.

GridCacheNearClientMissTest.java

  

Thanks
Tim(ay)





Near Cache Topoolgy change causes NearCache to always miss.

2017-10-17 Thread Timay
Hey all,


Versions: 2.1 & 2.3-SNAPSHOT
Setup: 1 client, 1+ servers


I found an issue with near caching. When a topology change occurs, the
GridNearCacheEntry.valid method updates GridNearCacheEntry.topVer to None.
This version then never gets "reset", so GridNearCacheEntry.valid always
returns false, because the None check is done before the version is updated.
This causes all future requests to miss the near cache and make a remote call
to the cluster.

From what I can see, the AffinityTopologyVersion gets set in a few spots,
so I was unsure where to patch this. For testing I added a compare check to
GridNearCacheEntry.loadedValue which sets the topVer to the latest.
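The behavior described above can be illustrated with a toy model. This is a sketch of the description only, not Ignite's actual GridNearCacheEntry code: ToyNearEntry, the long-valued topology version, and the patched flag are all assumptions made for illustration. It shows how checking for None before any refresh leaves an entry permanently invalid, and how refreshing the version on load (the test-only workaround) limits the damage to a single miss.

```java
class ToyNearEntry {
    static final long NONE = -1; // stands in for "no topology version"

    private long topVer;
    private final boolean patched;

    ToyNearEntry(long topVer, boolean patched) {
        this.topVer = topVer;
        this.patched = patched;
    }

    /** Validity check as described in the post: NONE is tested before any refresh. */
    boolean valid(long currentTopVer) {
        if (topVer == NONE)
            return false; // nothing here ever moves topVer away from NONE

        if (topVer != currentTopVer) {
            topVer = NONE; // topology changed: invalidate the entry
            return false;
        }

        return true;
    }

    /** Called after a miss has been served by the cluster. */
    void loadedValue(long currentTopVer) {
        // Workaround from the post: compare and move topVer to the latest version.
        if (patched && topVer != currentTopVer)
            topVer = currentTopVer;
    }

    /** Counts near-cache misses over a number of reads at a fixed topology version. */
    static int misses(ToyNearEntry e, long currentTopVer, int reads) {
        int misses = 0;

        for (int i = 0; i < reads; i++) {
            if (!e.valid(currentTopVer)) {
                misses++;
                e.loadedValue(currentTopVer);
            }
        }

        return misses;
    }
}
```

In this model, ten reads after a topology change from version 1 to 2 give ten misses without the workaround and one miss with it.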


This is a fairly large issue for us, as the performance goes from <1
millisecond to 40+ milliseconds.

Let me know if you have questions.





