It looks like the conversation has moved to Stack Overflow.
I will continue the discussion there.

Regards
Nabarun Nag

On Tue, Jan 14, 2020 at 9:23 AM Nabarun Nag <[email protected]> wrote:

> Hi David,
> We have started looking into it and will get you an answer soon.
>
> Regards
> Naba
>
> On Tue, Jan 14, 2020 at 7:48 AM David Loewy <[email protected]>
> wrote:
>
>> Hello!
>>
>>
>>
>> My team uses Geode as a makeshift analytics engine. We store a collection
>> of massive raw data objects (200MB+ each) in Geode, but these objects are
>> never directly returned to the client. Instead, we rely heavily on custom
>> function execution to process these data sets inside Geode, and only return
>> the analysis result set.
>>
>>
>>
>> We have a new requirement to implement two tiers of data analytics
>> precision. The high-precision analytics will require larger raw data sets
>> and more CPU time. It is imperative that these high-precision analyses do
>> not inhibit the low-precision analytics performance in any way. As such,
>> I'm looking for a solution that keeps these data sets isolated to different
>> servers.
>>
>>
>>
>> I built a POC that keeps each data set in its own region (both are
>> PARTITIONED). These regions are configured to belong to separate Member
>> Groups, then each server is configured to join one of the two groups. I'm
>> able to stand up this cluster locally without issue, and gfsh indicates
>> that everything looks correct: `describe member` shows each member hosting
>> the expected regions.
>>
>>
>>
>> My client code configures a ClientCache that points at the cluster's
>> single locator. My function execution command generally looks like the
>> following:
>>
>>
>>
>> FunctionService
>>   .onRegion(highPrecisionRegion)
>>   .setArguments(inputObject)
>>   .withFilter(keySet)
>>   .execute(function);
>>
>>
>>
>> When I only run the high-precision server, I'm able to execute the
>> function against the high-precision region. When I only run the
>> low-precision server, I'm able to execute the function against the
>> low-precision region. However, when I run both servers and execute the
>> functions one after the other, I invariably get an exception stating that
>> *one* of the regions cannot be found. See the following Gist for a sample
>> of my code and the exception.
>>
>> https://gist.github.com/dLoewy/c9f695d67f77ec18a7e60a25c4e62b01
>>
>>
>>
>> TLDR key points:
>>
>> 1) Using member groups, Region A is on Server 1 and Region B is on Server
>> 2.
>>
>> 2) These regions must be PARTITIONED in Production.
>>
>> 3) I need to run a *data-dependent* function on one of these regions; the
>> client code chooses which.
>>
>> 4) As-is, my client code always fails to find *one* of the regions.
>>
>>
>>
>> Can someone please help me get on track? Is there an entirely different
>> cluster architecture I should be considering? Happy to provide more detail
>> upon request.
>>
>>
>>
>> Thanks so much for your time!
>>
>>
>>
>> David
>>
>>
>>
>> FYI, the following docs pages mention function execution on Member
>> Groups, but give very little detail. The first link describes running
>> data-INdependent functions on member groups, but doesn't say how, and
>> doesn't say anything about running data-DEpendent functions on member
>> groups.
>>
>>
>> https://gemfire.docs.pivotal.io/99/geode/developing/function_exec/how_function_execution_works.html
>>
>>
>> https://gemfire.docs.pivotal.io/99/geode/developing/function_exec/function_execution.html
>>
>
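
For readers of this thread: the symptom David describes is consistent with both client regions sharing one connection pool, so an onRegion call can be routed to a server that does not host the target region. The following is a minimal sketch of one possible client-side arrangement, assuming that is the cause. It has not been confirmed as the resolution reached on Stack Overflow. It creates one pool per member group and binds each proxy region to its pool. The group names, pool names, region names, locator address, and function ID are all hypothetical placeholders, and the sketch needs the Geode client library and a running cluster.

```java
import java.util.Collections;
import java.util.Set;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;
import org.apache.geode.cache.client.PoolManager;
import org.apache.geode.cache.execute.FunctionService;
import org.apache.geode.cache.execute.ResultCollector;

public class TwoTierClient {
  public static void main(String[] args) {
    // Assumed locator address; replace with the cluster's real locator.
    String locatorHost = "localhost";
    int locatorPort = 10334;

    ClientCache cache = new ClientCacheFactory()
        .addPoolLocator(locatorHost, locatorPort)
        .create();

    // One named pool per member group, so each pool only discovers
    // servers that joined that group (group names are hypothetical).
    PoolManager.createFactory()
        .addLocator(locatorHost, locatorPort)
        .setServerGroup("highPrecision")
        .create("highPrecisionPool");
    PoolManager.createFactory()
        .addLocator(locatorHost, locatorPort)
        .setServerGroup("lowPrecision")
        .create("lowPrecisionPool");

    // Bind each proxy region to the pool for the group that hosts it.
    Region<String, Object> high = cache
        .<String, Object>createClientRegionFactory(ClientRegionShortcut.PROXY)
        .setPoolName("highPrecisionPool")
        .create("HighPrecision");
    Region<String, Object> low = cache
        .<String, Object>createClientRegionFactory(ClientRegionShortcut.PROXY)
        .setPoolName("lowPrecisionPool")
        .create("LowPrecision");

    // Data-dependent execution; the function ID and key are placeholders
    // for a function already registered on the servers.
    Set<String> keySet = Collections.singleton("someKey");
    ResultCollector<?, ?> collector = FunctionService.onRegion(high)
        .setArguments("inputObject")
        .withFilter(keySet)
        .execute("analysisFunction");
    System.out.println(collector.getResult());

    cache.close();
  }
}
```

With this arrangement, executing against `low` instead of `high` would go through lowPrecisionPool, so the client code can still choose the tier simply by choosing the region.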
