Thanks Vlad. So I define a custom class for my use case (basically a logical view of the table and its columns), like this: https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/model/Person.java. Then I create an Ignite cache of type <K, Person> and populate it with the constructed Person objects. After that I can directly run something like "select * from person where age > 33" to get a subset of it, right? The issue is that my original data is a DataFrame. If I do it this way, does that mean I have to parse the DataFrame manually and construct Person objects from the underlying data? Is this the only way to enable subset/filter pushdown? Or could I use the Spark JDBC API to do that mapping for me directly: JdbcUtils.saveTable(personDF, url, table, props)?
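For reference, the manual mapping asked about above could look roughly like the sketch below. It is plain Java with no Ignite or Spark calls: a `Map` stands in for the `<K, Person>` cache, a loop stands in for the SQL filter, and `PersonRow`, `ManualMappingSketch`, and the `[id, name, age]` row layout are all hypothetical names and assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical value class, loosely modeled on the Person example linked above. */
class PersonRow {
    final long id;
    final String name;
    final int age;

    PersonRow(long id, String name, int age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    /** Builds a PersonRow from one generic row laid out as [id, name, age] (assumed column order). */
    static PersonRow fromRow(Object[] row) {
        return new PersonRow(
            ((Number) row[0]).longValue(),
            (String) row[1],
            ((Number) row[2]).intValue());
    }
}

class ManualMappingSketch {
    /** Maps every row to a PersonRow keyed by id; the Map is a stand-in for a cache of type <Long, Person>. */
    static Map<Long, PersonRow> load(List<Object[]> rows) {
        Map<Long, PersonRow> cache = new LinkedHashMap<>();
        for (Object[] row : rows) {
            PersonRow p = PersonRow.fromRow(row);
            cache.put(p.id, p);
        }
        return cache;
    }

    /** Plain-Java equivalent of "select name from person where age > minAge". */
    static List<String> namesOlderThan(Map<Long, PersonRow> cache, int minAge) {
        List<String> names = new ArrayList<>();
        for (PersonRow p : cache.values()) {
            if (p.age > minAge) {
                names.add(p.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {1L, "Ann", 41});
        rows.add(new Object[] {2L, "Bob", 29});
        rows.add(new Object[] {3L, "Cid", 34});
        System.out.println(namesOlderThan(load(rows), 33)); // prints [Ann, Cid]
    }
}
```

With a real cache the filter would run inside the grid as a SQL query rather than in a client-side loop; the sketch only shows the shape of the row-to-object step.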
From: [email protected]
Subject: Re: Is it possible to enable both REPLICATED and PARTITIONED?

Hi Tracy,

Ignite supports SqlFieldsQuery for this purpose [1]. With the default (Binary) marshaller, SQL will use only the needed fields during evaluation.

[1]: https://apacheignite.readme.io/docs/sql-queries#fields-queries

On Mon, Oct 10, 2016 at 8:54 PM, Tracy Liang (BLOOMBERG/ 731 LEX) <[email protected]> wrote:

Thanks for this clear explanation, Alexey. Basically I want to use Ignite as a shared in-memory layer among multiple Spark Server instances. I also have another question: does the Ignite cache support predicate pushdown or a logical view of the cache? For example, I only want certain columns of the value instead of returning the entire universe. How do I do that?

From: [email protected]
Subject: Re: Is it possible to enable both REPLICATED and PARTITIONED?

Tracy,

First of all, the cache mode and the number of backups can be set only once, on cache start. So if you know the size of your cluster, you could set the number of backups before the cache starts. But I think it is not reasonable to set the number of backups equal to the number of nodes. If you need 100% high availability, just use a replicated cache. I would recommend thinking about how many nodes can be lost at once. Maybe it is reasonable to set backups = 2? The more backups you choose, the more memory will be consumed by backup partitions, and the grid will also spend time rebalancing data.

What is your use case?

On Mon, Oct 10, 2016 at 11:18 PM, Tracy Liang (BLOOMBERG/ 731 LEX) <[email protected]> wrote:

Thanks, and PARTITIONED mode can have any number of backups, right? I want backups for high availability, and my dataset is also large. I guess I will use PARTITIONED mode and configure the number of backups based on actual needs in that case, right?

From: [email protected]
Subject: Re: Is it possible to enable both REPLICATED and PARTITIONED?

Hi, Tracyl.

Actually, a REPLICATED cache is a PARTITIONED cache with backups on all nodes.
But why do you need this?

On Mon, Oct 10, 2016 at 10:46 AM, Tracyl <[email protected]> wrote:

As subject shows.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Is-it-possible-to-enable-both-REPLICATED-and-PARTITIONED-tp8167.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

--
Alexey Kuznetsov

--
Alexey Kuznetsov

--
Vladislav Pyatkov
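The trade-off Alexey describes between backups and memory can be made concrete with some rough arithmetic. The sketch below is plain Java and calls no Ignite API. It assumes each entry in a PARTITIONED cache has one primary copy plus one copy per configured backup (never more than one copy per node), that a REPLICATED cache keeps a copy on every node, and it ignores per-entry overhead and indexes. The class name, method names, and the 100 GB / 8 node figures are hypothetical.

```java
class BackupSizingSketch {
    /** Copies of each entry in a PARTITIONED cache: one primary plus the backups, capped at one copy per node. */
    static int copiesPartitioned(int backups, int nodes) {
        return Math.min(1 + backups, nodes);
    }

    /** A REPLICATED cache keeps a copy of each entry on every node. */
    static int copiesReplicated(int nodes) {
        return nodes;
    }

    /** Approximate cluster-wide memory for the data alone (no overhead, no indexes). */
    static long totalGb(long dataGb, int copies) {
        return dataGb * copies;
    }

    public static void main(String[] args) {
        long dataGb = 100; // hypothetical raw dataset size
        int nodes = 8;     // hypothetical cluster size

        // prints "PARTITIONED, backups=2: 300 GB"
        System.out.println("PARTITIONED, backups=2: "
            + totalGb(dataGb, copiesPartitioned(2, nodes)) + " GB");

        // prints "REPLICATED: 800 GB"
        System.out.println("REPLICATED: "
            + totalGb(dataGb, copiesReplicated(nodes)) + " GB");
    }
}
```

Under these assumptions, backups = 2 keeps the data available after losing any two nodes at once while using well under half the memory of full replication on this hypothetical cluster.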
