Re: Ignite Critical Failure -- Compound exception for CountDownFuture
So I tried without the Docker bind mount to the macOS filesystem and it worked fine. It looks like disk page compression (which relies on sparse files) does not work when the persistence directory is bound to APFS. Thanks! -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
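In case it helps others, the difference between the two setups is roughly the following command sketch (the named volume `ignite-work` is an arbitrary name I chose; a named volume lives on Docker Desktop's Linux VM filesystem, where sparse files / hole punching work):

```shell
# Bind mount to the macOS host (APFS) -- punchHole fails and the node halts:
docker run -v ${PWD}/work_dir_perf:/persistence apacheignite/ignite:2.10.0

# Named volume kept on the Linux VM's filesystem -- compression works:
docker volume create ignite-work
docker run -v ignite-work:/persistence apacheignite/ignite:2.10.0
```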
Re: Ignite Critical Failure -- Compound exception for CountDownFuture
I am testing with a Docker image based on "apacheignite/ignite:2.10.0". The Linux info: "Linux 35d4d7814e94 5.10.25-linuxkit #1 SMP Tue Mar 23 09:27:39 UTC 2021 x86_64 Linux". The container runs in Docker Desktop on my Mac, and I start it with the "-v ${PWD}/work_dir_perf:/persistence" option, so the persistence directory is bound to the macOS filesystem, which is APFS.
Ignite Critical Failure -- Compound exception for CountDownFuture
Hi, I am running into an Ignite (ver. 2.10.0) critical failure triggered by high write load. This is the error summary:

[04:11:11,605][SEVERE][db-checkpoint-thread-#72][] JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Compound exception for CountDownFuture.]]

The more detailed exception:

[04:11:11,435][INFO][db-checkpoint-thread-#72][Checkpointer] Checkpoint started [checkpointId=251fa396-1611-416f-a569-c93c1e8f6c84, startPtr=WALPointer [idx=8, fileOff=13437451, len=40871], checkpointBeforeLockTime=193ms, checkpointLockWait=6ms, checkpointListenersExecuteTime=33ms, checkpointLockHoldTime=45ms, walCpRecordFsyncDuration=16ms, writeCheckpointEntryDuration=17ms, splitAndSortCpPagesDuration=109ms, pages=76628, reason='too big size of WAL without checkpoint']
[04:11:11,470][SEVERE][db-checkpoint-thread-#72][] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.IgniteCheckedException: Compound exception for CountDownFuture.]]
class org.apache.ignite.IgniteCheckedException: Compound exception for CountDownFuture.
    at org.apache.ignite.internal.util.future.CountDownFuture.addError(CountDownFuture.java:72)
    at org.apache.ignite.internal.util.future.CountDownFuture.onDone(CountDownFuture.java:46)
    at org.apache.ignite.internal.util.future.CountDownFuture.onDone(CountDownFuture.java:28)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:478)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointPagesWriter.run(CheckpointPagesWriter.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
    Suppressed: class org.apache.ignite.IgniteException: errno: -1
        at org.apache.ignite.internal.processors.compress.NativeFileSystemLinux.punchHole(NativeFileSystemLinux.java:122)
        at org.apache.ignite.internal.processors.compress.FileSystemUtils.punchHole(FileSystemUtils.java:125)
        at org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIO.punchHole(AsyncFileIO.java:93)

Some background on what I am doing:
* I am using a data streamer to write ~1GB of data into a single Ignite node (laptop) with persistence enabled. Everything was working fine until I enabled disk compression (zstd level 3, 8KB page size). After I enabled disk compression I get the above exception.
* I tried enabling/disabling writeThrottlingEnabled, but it did not help.
* I turned the WAL archive off, and it did not help.
* I increased checkpointPageBufferSize from the default 256MB to 1GB; that delayed the exception until further into the upload, but the exception is still thrown eventually.
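For reference, the compression setup that triggers this looks roughly like the following (a sketch assuming Spring XML configuration; the cache name "myCache" is a placeholder, and the ignite-compress module must be on the classpath):

```xml
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <!-- 8 KB pages so a compressed page can be punched down to fewer 4 KB FS blocks -->
            <property name="pageSize" value="#{8 * 1024}"/>
            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <property name="persistenceEnabled" value="true"/>
                </bean>
            </property>
        </bean>
    </property>
    <property name="cacheConfiguration">
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
            <property name="name" value="myCache"/>
            <property name="diskPageCompression" value="ZSTD"/>
            <property name="diskPageCompressionLevel" value="3"/>
        </bean>
    </property>
</bean>
```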
IgniteCheckedException: Requesting mapping from grid failed for [platformId=0, typeId=-1220482121]
Hi, I used C# code (an entity object class) to create and write to a cache, and I am trying to use Java code (a corresponding object class) to read from that cache, but I am running into the IgniteCheckedException above. Is this scenario supported? Here's the C# entity class: Here's the corresponding Java class: I am able to write to the cache from the Java side and then read it back. However, the object written from the Java side does not show up in the SQL queries (the cache/table was created using the C# entity class). -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
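From the interop docs I gather that type ids are derived from the class names, which differ between the C# and Java namespaces, so a simple-name mapper may be needed on both sides. This is a sketch of what I am planning to try on the Java side (Spring XML; the .NET side would need the equivalent BinaryConfiguration):

```xml
<property name="binaryConfiguration">
    <bean class="org.apache.ignite.configuration.BinaryConfiguration">
        <!-- map types by simple class name so C# and Java namespaces need not match -->
        <property name="nameMapper">
            <bean class="org.apache.ignite.binary.BinaryBasicNameMapper">
                <property name="simpleName" value="true"/>
            </bean>
        </property>
    </bean>
</property>
```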
Re: Designing Affinity Key for more locality
I am using a user-centric modeling approach where most of the computation happens on a per-user basis (joins) before aggregation. The idea is to put the data (across different tables/caches) for the same user in the same partition/server; that is why I chose user-id as the affinity key. Using tenant/group as the affinity key does not scale well: some tenant/group datasets might be too large for one partition/server (we are using persistence mode), and even if a dataset did fit, it would not benefit from load balancing the computation across the servers. -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Re: Designing Affinity Key for more locality
I came across this statement in the Data Partitioning documentation: "The affinity function determines the mapping between keys and partitions. Each partition is identified by a number from a limited set (0 to 1023 by default)." So it looks like there is no point in adding another layer of mapping unless I am going for a smaller number of buckets. Are there other ways in Ignite to get more locality for a subset of the data? -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Designing Affinity Key for more locality
Hi, I am currently using user-id as the affinity key, and because it is a uuid/string, keys distribute across all partitions in the cluster. However, in my scenario I write and read users belonging to the same tenant/group, so it seems better to design for more read/write locality within a partition. One approach I am considering is to hash tenant-id + user-id into 1024 integer values (think of them as logical buckets) and use that integer as the affinity key. This way I still get colocation of a user's data while also getting more locality within a partition. The question is whether there are negative trade-offs in Ignite with this approach. Note that I am also using Ignite SQL, so I plan to expose this integer as a SQL field so that I can do a colocated join on the user. Is that even necessary if distributed joins are disabled? Thanks. -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
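To make the bucketing idea concrete, here is a minimal sketch of the affinity-key derivation I mean (plain Java; BUCKETS = 1024 and the ":" separator are my arbitrary choices, not anything Ignite-specific):

```java
public class AffinityBucket {
    static final int BUCKETS = 1024;

    // Derive a stable bucket id in [0, BUCKETS) from tenant + user.
    // Users of one tenant still spread over many buckets, but a given
    // (tenant, user) pair always lands in the same bucket, so all of
    // that user's rows colocate.
    public static int bucketFor(String tenantId, String userId) {
        int h = (tenantId + ":" + userId).hashCode();
        return Math.floorMod(h, BUCKETS); // floorMod keeps the result non-negative
    }

    public static void main(String[] args) {
        int b1 = bucketFor("tenant-a", "user-1");
        int b2 = bucketFor("tenant-a", "user-1");
        System.out.println(b1 == b2);                // deterministic: true
        System.out.println(b1 >= 0 && b1 < BUCKETS); // in range: true
    }
}
```

The resulting int is what I would annotate as the affinity key and expose as the SQL field for colocated joins.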
C# DataStreamer Best Practices
Hi, we are hitting a performance wall using the C# thin client's PutAsync for data upload to Ignite, and we are in the process of migrating to the C# thick client (in client mode) with the data streamer. I would appreciate advice on best practices for using the thick client:
* Ignite instance -- should we use a single instance within the server process? It seems like the C# client is a wrapper around a JVM, so there is no point in using multiple instances?
* DataStreamer -- I am aware from some postings that it is thread-safe. My question is whether there is any benefit in sharing/reusing a DataStreamer instance (e.g. better batching for colocated data) versus using more DataStreamer instances (more parallelism/connections)?
* Retries and connection failures -- there is no mention of connection failure scenarios or retry settings. Can I assume the DataStreamer (via the JVM) will take care of connection failures and reconnecting?
* Failures -- are these surfaced as exceptions from the Task returned by AddData()? Or do I have to use Flush/Close?
Thanks -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Understanding SQL join performance
Hi, I am trying to understand why my colocated join between two tables/caches takes so long compared to the individual table filters:

TABLE1 -- returns a count of 1 -- 0.13s
TABLE2 -- returns a count of 65000 -- 0.643s
JOIN of TABLE1 and TABLE2 -- returns a count of 650K -- 7s

Both analysis_input and analysis_output have an index on (cohort_id, user_id, timestamp). The affinity key is user_id. How do I analyze the performance further? Here's the explain which does not tell me much: Is Ignite doing the join and filtering at each data node and then sending the 650K total rows to the reducer before aggregation? If so, is it possible for Ignite to do some aggregation at the data nodes first and then send the first-level aggregation results to the reducer? -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Re: MapReduce - How to efficiently scan through subset of the caches?
Thanks for the pointers, stephendarlington and ptupitsyn. It looks like I can run a mapper that does a local SQL query to get the set of keys for the tenant (those residing on the local server node), and then use Compute.affinityRun or Cache.invokeAll. Cache.invokeAll takes a dictionary of keys to EntryProcessor, so that is easy to understand. For Compute.affinityRun, I am not sure how to apply it to my scenario:
* It takes an affinity key to find the partition's server on which to run the IgniteRunnable, but I don't see an interface for passing in the specific keys. Am I expected to pass the key set as part of the IgniteRunnable object?
* Suppose the cache uses user_id as the affinity key; then two user_ids may map to the same partition. How do I avoid duplicate processing/scanning?
-- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
MapReduce - How to efficiently scan through subset of the caches?
Hi, I am investigating whether the MapReduce API is the right tool for my scenario. Here's the context of the caches:
* There are multiple caches for different types of datasets.
* Each cache holds multi-tenant data, and the tenant id is part of the cache key.
* Each cache entry is a complex json/binary object that I want to run a computation on (let's just say it is hard to do in SQL), returning some complex result per entry (e.g. a dictionary) that I then want to reduce/aggregate.
* The cluster has persistence enabled because we have more data than memory.
My scenario is to run the MapReduce operation only on the data for a specific tenant (a small subset of the data). From reading the forum about MapReduce, it seems the best way to do this is to use the IgniteCache.localEntries API and iterate through each node's local cache. My concern with this approach is that we would loop through all K entries of the cache, which is very inefficient. Is there a more efficient way to filter only the relevant keys and then access just the matching entries? Thanks. -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/