Re: Question about INDEX_THRESHOLD_SIZE

2022-03-10 Thread Anilkumar Gingade
Mario,

A few changes were made in this area as part of the GEODE-9632 fix; could 
you please revert that change and check whether the query works both with and 
without the index?
Looking at the code, it seems to restrict the number of index lookups that 
need to be performed. Certain latency- or throughput-sensitive queries that do 
not expect exact results may use this (my guess), but by default it should not 
produce unexpected results.

-Anil.


On 3/10/22, 6:50 AM, "Mario Kevo"  wrote:

Hi geode-dev,

Some time ago I was working on allowing INDEX_THRESHOLD_SIZE System 
property to override CompiledValue.RESULT_LIMIT.
After this change, the attribute is taken into account if you set it.
But I need some clarification about this INDEX_THRESHOLD_SIZE attribute. Why 
is it set to 100 by default?
The main problem with this attribute is that, to get the correct result, you 
need to know when starting the servers how many entries the region will hold, 
and set the attribute to that value or higher. It is sometimes very hard to 
know how many entries will be in the region, so it might be better to set the 
default to some higher number, something like Integer.MAX_VALUE.

Where is this attribute used?
It is used to get index results while running queries.

What is the problem?
If INDEX_THRESHOLD_SIZE is set to 500 and the region has 1k entries, a query 
may fetch only 500 entries from the index; if none of them satisfies the where 
clause, we get no results.
Let's look at an example.

We have only one entry that matches the query condition, 
INDEX_THRESHOLD_SIZE set to 500, and 1k entries in the region.
If we run the query without an index, we get the result.
gfsh>query --query="SELECT e.key, e.value from /example-region.entrySet e where e.value.positions['SUN'] like 'someth%'"
Result  : true
Limit   : 100
Rows: 1
Query Trace : Query Executed in 10.750238 ms; indexesUsed(0)

key | value
--- | -----
700 | {"ID":700,"indexKey":0,"pkid":"700","shortID":null,"position1":{"mktValue":1945.0,"secId":"ORCL","secIdIndexed":"ORCL","secType":null,"sharesOutstanding":1944000.0,"underlyer":null,"pid":1944,"portfolioId":700,..
If we create an index and then run this query again, there is no result.
gfsh>query --query="SELECT e.key, e.value from /example-region.entrySet e where e.value.positions['SUN'] like 'someth%'"
Result  : true
Limit   : 100
Rows: 0
Query Trace : Query Executed in 22.079016 ms; indexesUsed(1):index1(Results: 500)
This happened because the entry that matches the condition did not happen to 
be among the intermediate results returned by the index.
So the questions are:
What if more entries enter the region, so that the index returns more 
entries than the threshold allows? Then we again risk that the query 
condition will not match.
Why is this attribute set to 100 by default?
Can we change the default to Integer.MAX_VALUE to be sure that we get the 
correct result? What are the consequences?

BR,
Mario




[ANNOUNCE] Apache Geode 1.12.9

2022-03-10 Thread Dick Cavender
The Apache Geode community is pleased to announce the availability of
Apache Geode 1.12.9.

Geode is a data management platform that provides a database-like consistency
model, reliable transaction processing and a shared-nothing architecture
to maintain very low latency performance with high concurrency processing.

Apache Geode 1.12.9 contains a number of bug fixes.
Users are encouraged to upgrade to the latest 1.14.x release (currently 1.14.3).
For the full list of changes please review the release notes at:
https://cwiki.apache.org/confluence/display/GEODE/Release+Notes#ReleaseNotes-1.12.9

Release artifacts and documentation can be found at the project website:
https://geode.apache.org/releases/
https://geode.apache.org/docs/guide/112/about_geode.html

We would like to thank all the contributors that made the release possible.

Regards,

Dick Cavender on behalf of the Apache Geode team


Question about INDEX_THRESHOLD_SIZE

2022-03-10 Thread Mario Kevo
Hi geode-dev,

Some time ago I was working on allowing INDEX_THRESHOLD_SIZE System property to 
override CompiledValue.RESULT_LIMIT.
After this change, the attribute is taken into account if you set it.
But I need some clarification about this INDEX_THRESHOLD_SIZE attribute. Why is 
it set to 100 by default?
The main problem with this attribute is that, to get the correct result, you 
need to know when starting the servers how many entries the region will hold, 
and set the attribute to that value or higher. It is sometimes very hard to 
know how many entries will be in the region, so it might be better to set the 
default to some higher number, something like Integer.MAX_VALUE.
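
For illustration only, a threshold with a default of 100 that a system 
property can override could be read roughly as sketched below. The class 
name and the property name are placeholders chosen for this example; they are 
assumptions, not Geode's actual identifiers.

```java
class ThresholdProperty {
  // Placeholder property name (assumption, not taken from the Geode source).
  static final String PROP = "example.Query.INDEX_THRESHOLD_SIZE";

  // Default mentioned in this thread.
  static final int DEFAULT_THRESHOLD = 100;

  // Returns the value of the system property if it is set and parses as an
  // integer; otherwise falls back to the default.
  static int resolveThreshold() {
    return Integer.getInteger(PROP, DEFAULT_THRESHOLD);
  }

  public static void main(String[] args) {
    System.out.println(resolveThreshold());
  }
}
```

With such a scheme the value would be passed at JVM startup, for example as 
-Dexample.Query.INDEX_THRESHOLD_SIZE=500 (again using the placeholder name).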

Where is this attribute used?
It is used to get index results while running queries.

What is the problem?
If INDEX_THRESHOLD_SIZE is set to 500 and the region has 1k entries, a query 
may fetch only 500 entries from the index; if none of them satisfies the where 
clause, we get no results.
Let's look at an example.

We have only one entry that matches the query condition, 
INDEX_THRESHOLD_SIZE set to 500, and 1k entries in the region.
If we run the query without an index, we get the result.
gfsh>query --query="SELECT e.key, e.value from /example-region.entrySet e where e.value.positions['SUN'] like 'someth%'"
Result  : true
Limit   : 100
Rows: 1
Query Trace : Query Executed in 10.750238 ms; indexesUsed(0)

key | value
--- | -----
700 | {"ID":700,"indexKey":0,"pkid":"700","shortID":null,"position1":{"mktValue":1945.0,"secId":"ORCL","secIdIndexed":"ORCL","secType":null,"sharesOutstanding":1944000.0,"underlyer":null,"pid":1944,"portfolioId":700,..
If we create an index and then run this query again, there is no result.
gfsh>query --query="SELECT e.key, e.value from /example-region.entrySet e where e.value.positions['SUN'] like 'someth%'"
Result  : true
Limit   : 100
Rows: 0
Query Trace : Query Executed in 22.079016 ms; indexesUsed(1):index1(Results: 500)
This happened because the entry that matches the condition did not happen to 
be among the intermediate results returned by the index.
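
The effect can be reproduced with a minimal sketch that is independent of 
Geode. It assumes that the index hands back at most `threshold` candidates 
(here simply in key order) and that the where clause is applied only to those 
candidates. The names IndexThresholdDemo, indexLookup and query are 
hypothetical and do not correspond to Geode classes or methods.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class IndexThresholdDemo {

  // Hypothetical stand-in for an index lookup: returns at most `threshold`
  // candidate keys, in index order, before the where clause is evaluated.
  static List<Integer> indexLookup(List<Integer> indexedKeys, int threshold) {
    return indexedKeys.stream().limit(threshold).collect(Collectors.toList());
  }

  // Applies the where condition to the (possibly truncated) candidates only.
  static List<Integer> query(List<Integer> indexedKeys, int threshold,
      Predicate<Integer> where) {
    return indexLookup(indexedKeys, threshold).stream()
        .filter(where)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // 1k entries, of which only entry 700 matches the condition.
    List<Integer> keys = IntStream.range(0, 1000).boxed().collect(Collectors.toList());
    Predicate<Integer> where = k -> k == 700;

    // Threshold 500: entry 700 is cut off before filtering, so 0 rows.
    System.out.println(query(keys, 500, where).size());
    // No effective threshold: the matching entry is found, so 1 row.
    System.out.println(query(keys, Integer.MAX_VALUE, where).size());
  }
}
```

Under these assumptions the result depends only on whether the matching entry 
falls within the first `threshold` candidates, which is what the two gfsh 
runs above suggest.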
So the questions are:
What if more entries enter the region, so that the index returns more 
entries than the threshold allows? Then we again risk that the query 
condition will not match.
Why is this attribute set to 100 by default?
Can we change the default to Integer.MAX_VALUE to be sure that we get the 
correct result? What are the consequences?

BR,
Mario



Question related to gateway-receivers connection load balancing

2022-03-10 Thread Jakov Varenina

Hi devs,

We have observed some weird behavior related to load balancing of 
gateway-receiver connections in the geode cluster. Currently, the 
gateway-receiver connection load is only updated on the coordinator locator 
when it provides a server location to the remote gateway-sender in the 
ClientConnectionRequest{group=__recv_group...}/ClientConnectionResponse 
message exchange. Other locators never update the gateway-receiver 
connection load, since they do not handle these messages. 
Additionally, locators (including the coordinator) ignore 
CacheServerLoadMessage messages that are carrying the receiver's 
connection load. This means that locators will not adjust the load when 
the connection on some receiver is shut down.
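
A heavily simplified sketch of the behavior described above, under the 
assumption that a locator keeps a per-receiver connection estimate, bumps it 
when it answers a connection request, and drops load reports from receivers. 
The class and method names (ReceiverLoadDemo, LocatorView, and so on) are 
hypothetical, and integer connection counts stand in for whatever load metric 
the locator really tracks.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

class ReceiverLoadDemo {

  // Hypothetical, simplified view a locator keeps of receiver load.
  static class LocatorView {
    private final Map<String, Integer> load = new TreeMap<>();

    void addReceiver(String name) {
      load.put(name, 0);
    }

    // Connection request: pick the receiver with the lowest estimated load
    // and increase its estimate by one connection.
    String handleConnectionRequest() {
      String chosen =
          Collections.min(load.entrySet(), Map.Entry.comparingByValue()).getKey();
      load.merge(chosen, 1, Integer::sum);
      return chosen;
    }

    // Load report from a receiver: modelled as ignored, as described above,
    // so the estimate is never corrected downwards.
    void handleLoadMessage(String name, int actualConnections) {
      // intentionally a no-op
    }

    int estimatedLoad(String name) {
      return load.get(name);
    }
  }

  public static void main(String[] args) {
    LocatorView locator = new LocatorView();
    locator.addReceiver("receiver-1");
    locator.addReceiver("receiver-2");

    // Three requests are handed out as receiver-1, receiver-2, receiver-1.
    locator.handleConnectionRequest();
    locator.handleConnectionRequest();
    locator.handleConnectionRequest();

    // Both connections on receiver-1 are shut down; its report is ignored.
    locator.handleLoadMessage("receiver-1", 0);

    // The next request still goes to receiver-2, although receiver-1 is idle.
    System.out.println(locator.handleConnectionRequest());
  }
}
```

In this model the stale estimate for receiver-1 is never corrected, so new 
connections keep going to the receiver that is actually busier.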


Is this expected behavior, or is this a bug?

You can find more information in this PR:

https://github.com/apache/geode/pull/7378#issuecomment-1048513322

and ticket:

https://issues.apache.org/jira/browse/GEODE-10056

Thanks,

Jakov