Re: Ignite Critical Failure -- Compound exception for CountDownFuture

2021-05-28 Thread William.L
So I tried without the Docker bind mount to the OSX filesystem and it worked
fine. It looks like disk compression (which relies on sparse files / hole
punching) does not work when the persistence directory is bound to APFS.

Thanks!


Re: Ignite Critical Failure -- Compound exception for CountDownFuture

2021-05-28 Thread William.L
I am testing with a Docker image based on "apacheignite/ignite:2.10.0". The
Linux info is: "Linux 35d4d7814e94 5.10.25-linuxkit #1 SMP Tue Mar 23 09:27:39
UTC 2021 x86_64 Linux".

The container is running in Docker Desktop on my Mac (OSX). I am calling
docker run with the "-v ${PWD}/work_dir_perf:/persistence" option, so the
persistence directory is bound to the OSX filesystem, which is APFS.
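
For reference, the run command is roughly the following (a sketch only; the
image name is a placeholder for my derived image, and other options are
omitted):

docker run -v ${PWD}/work_dir_perf:/persistence my-ignite-image:2.10.0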


Ignite Critical Failure -- Compound exception for CountDownFuture

2021-05-27 Thread William.L
Hi,

I am running into an Ignite (ver. 2.10.0) critical failure triggered by high
write load. This is the error summary:

[04:11:11,605][SEVERE][db-checkpoint-thread-#72][] JVM will be halted
immediately due to the failure: [failureCtx=FailureContext
[type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Compound
exception for CountDownFuture.]]

The more detailed exception:
[04:11:11,435][INFO][db-checkpoint-thread-#72][Checkpointer] Checkpoint
started [checkpointId=251fa396-1611-416f-a569-c93c1e8f6c84,
startPtr=WALPointer [idx=8, fileOff=13437451, len=40871],
checkpointBeforeLockTime=193ms, checkpointLockWait=6ms,
checkpointListenersExecuteTime=33ms, checkpointLockHoldTime=45ms,
walCpRecordFsyncDuration=16ms, writeCheckpointEntryDuration=17ms,
splitAndSortCpPagesDuration=109ms, pages=76628, reason='too big size of WAL
without checkpoint']
[04:11:11,470][SEVERE][db-checkpoint-thread-#72][] Critical system error
detected. Will be handled accordingly to configured handler
[hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0,
super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet
[SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]],
failureCtx=FailureContext [type=CRITICAL_ERROR, err=class
o.a.i.IgniteCheckedException: Compound exception for CountDownFuture.]]
class org.apache.ignite.IgniteCheckedException: Compound exception for
CountDownFuture.
at
org.apache.ignite.internal.util.future.CountDownFuture.addError(CountDownFuture.java:72)
at
org.apache.ignite.internal.util.future.CountDownFuture.onDone(CountDownFuture.java:46)
at
org.apache.ignite.internal.util.future.CountDownFuture.onDone(CountDownFuture.java:28)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:478)
at
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:166)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Suppressed: class org.apache.ignite.IgniteException: errno: -1
at
org.apache.ignite.internal.processors.compress.NativeFileSystemLinux.punchHole(NativeFileSystemLinux.java:122)
at
org.apache.ignite.internal.processors.compress.FileSystemUtils.punchHole(FileSystemUtils.java:125)
at
org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.punchHole(AsyncFileIO.java:93)


Some background on what I am doing:
* I am using the data streamer to write ~1GB of data into a single Ignite node
(laptop) with persistence enabled. Everything was working fine until I enabled
disk page compression (ZSTD level 3, 8KB page size; config sketch below). With
compression enabled I get the above exception.
* I tried enabling/disabling writeThrottlingEnabled but it did not help.
* I turned the WAL archive off and it did not help.
* I increased checkpointPageBufferSize from the default 256MB to 1GB; that
delayed the exception until further into the upload, but it is still thrown
eventually.
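
For reference, here is a minimal Java sketch of the relevant settings (an
approximation, not my exact config; the cache name is a placeholder, and the
ignite-compress module is on the classpath):

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.DiskPageCompression;
import org.apache.ignite.configuration.IgniteConfiguration;

public class CompressionSetupSketch {
    public static void main(String[] args) {
        DataStorageConfiguration storage = new DataStorageConfiguration()
            .setPageSize(8192)               // page size larger than the FS block size, needed for compression
            .setWriteThrottlingEnabled(true) // toggled on and off while testing
            .setDefaultDataRegionConfiguration(new DataRegionConfiguration()
                .setPersistenceEnabled(true)
                .setCheckpointPageBufferSize(1024L * 1024 * 1024)); // raised from the 256MB default to 1GB

        CacheConfiguration<Object, Object> cache = new CacheConfiguration<>("testCache") // placeholder name
            .setDiskPageCompression(DiskPageCompression.ZSTD)
            .setDiskPageCompressionLevel(3);

        Ignition.start(new IgniteConfiguration()
            .setDataStorageConfiguration(storage)
            .setCacheConfiguration(cache));
    }
}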


IgniteCheckedException: Requesting mapping from grid failed for [platformId=0, typeId=-1220482121]

2021-05-03 Thread William.L
Hi,

I used C# code (an entity object class) to create and write to a cache, and I
am trying to use Java code (a corresponding object class) to read from that
cache, but I am running into an IgniteCheckedException:



Is this scenario supported?

Here's the C# entity class:


Here's the corresponding Java class:


I am able to write to the cache from the Java side and then read it back.
However, the object written from the Java side does not show up in SQL queries
(the cache/table was created using the C# entity class).


Re: Designing Affinity Key for more locality

2021-04-29 Thread William.L
I am using a user-centric modeling approach where most of the computations
would be on a per-user basis (joins) before aggregation. The idea is to put
the data (across different tables/caches) for the same user in the same
partition/server. That is why I chose user-id as the affinity key.

Using tenant/group as the affinity key does not scale well. Some tenant/group
datasets might be too large for one partition/server (we are using persistence
mode), and even if the data fits, it would not benefit from load balancing of
the computation across the servers.
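
To make that concrete, here is a minimal sketch of the kind of key class I am
using (class and field names are hypothetical, not my actual model):

import org.apache.ignite.cache.affinity.AffinityKeyMapped;

public class UserEventKey {
    private final String eventId;

    @AffinityKeyMapped
    private final String userId; // all per-user data keyed this way lands in the same partition

    public UserEventKey(String eventId, String userId) {
        this.eventId = eventId;
        this.userId = userId;
    }

    // equals() and hashCode() omitted for brevity
}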


Re: Designing Affinity Key for more locality

2021-04-26 Thread William.L
I came across this statement in the Data Partitioning documentation:

"The affinity function determines the mapping between keys and partitions.
Each partition is identified by a number from a limited set (0 to 1023 by
default)."

It looks like there is no point in adding another layer of mapping unless I am
going for a smaller number of buckets.
Are there other ways in Ignite to get more locality for a subset of the data?


Designing Affinity Key for more locality

2021-04-26 Thread William.L
Hi,

I am currently using user-id as the affinity key, and because it is a
UUID/string, the data gets distributed across all partitions in the cluster.
However, in my scenario I write and read users that belong to the same
tenant/group, so it seems better to design for more read/write locality within
a partition.

One approach I am thinking about is to hash tenant-id + user-id into one of
1024 integer values (think of them as logical buckets) and use that as the
affinity key. This way I still get colocation of each user's data while also
getting more locality within a partition.
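
A minimal sketch of the bucketing I have in mind (class and field names are
hypothetical):

import java.util.Objects;
import org.apache.ignite.cache.affinity.AffinityKeyMapped;

public class BucketedUserKey {
    private final String tenantId;
    private final String userId;

    @AffinityKeyMapped
    private final int bucket; // logical bucket in [0, 1024) used for colocation

    public BucketedUserKey(String tenantId, String userId) {
        this.tenantId = tenantId;
        this.userId = userId;
        // Hash tenant-id + user-id into one of 1024 logical buckets.
        this.bucket = Math.floorMod(Objects.hash(tenantId, userId), 1024);
    }

    // equals() and hashCode() omitted for brevity
}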

The question is whether there are negative trade-offs in Ignite to using this
approach.

Note that I am also using Ignite SQL, so I plan to expose this integer as a
SQL field so that I can do colocated joins on the user. Is this even necessary
if distributed joins are disabled?

Thanks.


C# DataStreamer Best Practices

2021-04-23 Thread William.L
Hi,

We are hitting a performance wall using the C# thin client's PutAsync for data
upload to Ignite. We are in the process of migrating to the C# thick client
(in client mode) and its data streamer. I would appreciate advice on best
practices for using the thick client:
* Ignite instance -- should we use a single instance within the server
process? Since the C# client is a wrapper around a JVM, is there any point in
using multiple instances?
* DataStreamer -- I am aware from other postings that it is thread-safe; my
question is whether there is any benefit in sharing/reusing a DataStreamer
instance (e.g. better batching for colocated data) versus using more
DataStreamer instances (more parallelism/connections)? See the sketch below
for the pattern I have in mind.
* Retries and connection failures -- the docs do not mention connection
failure scenarios or retry settings. Can I assume the DataStreamer (via the
JVM) will take care of connection failures and reconnecting?
* Failures -- are these surfaced as exceptions from the Task returned by
AddData(), or do I have to use Flush/Close?
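
For context, this is the usage pattern I have in mind, sketched against the
Java API that the .NET thick client wraps (cache name, key/value types and
sizes are placeholders, not our actual setup):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.lang.IgniteFuture;

public class StreamerSketch {
    public static void main(String[] args) {
        // One thick client node per process, shared by all upload threads.
        Ignite ignite = Ignition.start(new IgniteConfiguration().setClientMode(true));

        // One streamer per cache; it is thread-safe and batches per target node.
        try (IgniteDataStreamer<String, byte[]> streamer = ignite.dataStreamer("myCache")) {
            for (int i = 0; i < 1_000_000; i++) {
                IgniteFuture<?> fut = streamer.addData("key-" + i, new byte[128]);
                // The future completes when the batch containing this entry is
                // flushed; errors also surface from flush()/close().
            }
            streamer.flush();
        }

        ignite.close();
    }
}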

Thanks


Understanding SQL join performance

2021-04-23 Thread William.L
Hi,

I am trying to understand why my colocated join between two tables/caches is
taking so long compared to the individual table filters.

TABLE1

Returns 1 count -- 0.13s

TABLE2

Returns 65000 count -- 0.643s


 JOIN TABLE1 and TABLE2

Returns 650K count -- 7s

Both analysis_input and analysis_output have an index on (cohort_id, user_id,
timestamp). The affinity key is user_id. How do I analyze the performance
further?

Here's the EXPLAIN output, which does not tell me much:



Is Ignite doing the join and filtering at each data node and then sending all
650K rows to the reducer before aggregation? If so, is it possible for Ignite
to do some of the aggregation at the data nodes first and then send only the
first-level aggregation results to the reducer?
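
For illustration, here is a hypothetical reconstruction of the join shape (not
my exact statement); the setCollocated(true) flag shown is an assumption about
how map-side grouping might be enabled, not something I have confirmed:

import java.util.List;
import org.apache.ignite.Ignite;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class JoinQuerySketch {
    static void run(Ignite ignite) {
        SqlFieldsQuery qry = new SqlFieldsQuery(
            "SELECT i.user_id, COUNT(*) " +
            "FROM analysis_input i JOIN analysis_output o " +
            "  ON i.user_id = o.user_id AND i.cohort_id = o.cohort_id " +
            "WHERE i.cohort_id = ? " +
            "GROUP BY i.user_id")
            .setArgs("cohort-1")
            // Hint to the SQL engine that the GROUP BY column is colocated by
            // affinity, so grouping can run on the map (data) nodes.
            .setCollocated(true);

        // Cache name is a placeholder for whichever cache backs analysis_input.
        for (List<?> row : ignite.cache("analysis_input").query(qry).getAll())
            System.out.println(row);
    }
}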


Re: MapReduce - How to efficiently scan through subset of the caches?

2021-04-23 Thread William.L
Thanks for the pointers stephendarlington, ptupitsyn.

It looks like I can run a mapper that does a local SQL query to get the set of
keys for the tenant that reside on the local server node, and then use
Compute.affinityRun or Cache.invokeAll.
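
A minimal sketch of what I mean, assuming a cache named "userData" with the
tenant id exposed as a SQL field (all cache/table/field names are
hypothetical):

import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.cache.processor.EntryProcessor;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.lang.IgniteRunnable;
import org.apache.ignite.resources.IgniteInstanceResource;

public class TenantMapper implements IgniteRunnable {
    @IgniteInstanceResource
    private transient Ignite ignite;

    private final String tenantId;

    public TenantMapper(String tenantId) {
        this.tenantId = tenantId;
    }

    @Override public void run() {
        IgniteCache<Object, Object> cache = ignite.cache("userData");

        // Local-only SQL: collects keys for this tenant that are stored on this node.
        SqlFieldsQuery qry = new SqlFieldsQuery("SELECT _key FROM UserData WHERE tenantId = ?")
            .setArgs(tenantId)
            .setLocal(true);

        Set<Object> keys = new HashSet<>();
        for (List<?> row : cache.query(qry).getAll())
            keys.add(row.get(0));

        // Process only the matching entries; since the keys are local, the
        // processor should run here rather than being shipped to another node.
        EntryProcessor<Object, Object, Object> proc = (entry, args) -> {
            // ... per-entry computation on entry.getValue() ...
            return null;
        };
        cache.invokeAll(keys, proc);
    }
}

// Usage: broadcast the mapper to every node that holds data for the cache.
// ignite.compute(ignite.cluster().forDataNodes("userData")).broadcast(new TenantMapper("tenant-42"));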

For Cache.invokeAll, it takes a dictionary of keys to EntryProcessors, so that
is easy to understand.

For Compute.affinityRun, I am not sure how to use it for my scenario:
* It takes an affinity key to find the partition's server on which to run the
IgniteRunnable, but I don't see an interface for passing in the specific keys.
Am I expected to pass the key set as part of the IgniteRunnable object?
* Suppose the cache uses user_id as the affinity key; then it is possible that
two user_ids map to the same partition. How do I avoid duplicate
processing/scanning?


MapReduce - How to efficiently scan through subset of the caches?

2021-04-23 Thread William.L
Hi,

I am investigating whether the MapReduce API is the right tool for my
scenario. Here's the context of the caches:
* Multiple caches for different types of datasets
* Each cache holds multi-tenant data, and the tenant id is part of the cache
key
* Each cache entry is a complex JSON/binary object that I want to run a
computation on (let's just say it is hard to do in SQL), returning a complex
result per entry (e.g. a dictionary) that I then want to reduce/aggregate
* The cluster has persistence enabled because we have more data than memory

My scenario is to run the MapReduce operation only on data for a specific
tenant (a small subset of the data). From reading the forum threads about
MapReduce, it seems the best way to do this is to use the
IgniteCache.localEntries API and iterate through each node's local cache. My
concern with this approach is that we would be looping through the whole cache
(all keys), which is very inefficient. Is there a more efficient way to filter
only the relevant keys and then access just the matching entries?
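
For reference, this is the localEntries loop I am describing, as a minimal
sketch (cache name, key/value types, and the key-prefix convention are
hypothetical):

import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CachePeekMode;

public class LocalScanSketch {
    static void scanTenant(Ignite ignite, String tenantId) {
        IgniteCache<String, byte[]> cache = ignite.cache("userData"); // hypothetical cache

        // Iterates the entries stored on this node (primary copies), so the
        // filter runs over the full local dataset, not just one tenant.
        for (Cache.Entry<String, byte[]> e : cache.localEntries(CachePeekMode.PRIMARY)) {
            if (e.getKey().startsWith(tenantId + ":")) // tenant id assumed to prefix the key
                process(e.getValue());
        }
    }

    static void process(byte[] value) {
        // ... per-entry computation ...
    }
}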

Thanks.
