Populate ddl (e.g., create table) to third party persistence

2020-01-16 Thread crypto_ricklee
Hi all,

I'm using Postgres as the third-party persistence store for Ignite 2.7.6. If I
create a table on the Ignite cluster, is there any way to propagate the schema
(DDL) back to Postgres automatically?

Thanks!
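
For reference, this is roughly the kind of setup in question (a minimal Java sketch only; the Person type, table, columns and JDBC URL are illustrative, not an actual configuration). Note the comment about the Postgres table: as far as I know it has to exist already, which is exactly the manual step being asked about.

import java.sql.Types;
import org.apache.ignite.cache.store.jdbc.CacheJdbcPojoStoreFactory;
import org.apache.ignite.cache.store.jdbc.JdbcType;
import org.apache.ignite.cache.store.jdbc.JdbcTypeField;
import org.apache.ignite.cache.store.jdbc.dialect.BasicJdbcDialect;
import org.apache.ignite.configuration.CacheConfiguration;
import org.postgresql.ds.PGSimpleDataSource;

public class PostgresStoreConfig {
    public static CacheConfiguration<Long, Person> personCache() {
        JdbcType type = new JdbcType();
        type.setCacheName("personCache");
        type.setKeyType(Long.class);
        type.setValueType(Person.class);
        // The schema/table below must already exist in Postgres: the store
        // writes data through, but it does not issue the CREATE TABLE itself.
        type.setDatabaseSchema("public");
        type.setDatabaseTable("person");
        type.setKeyFields(new JdbcTypeField(Types.BIGINT, "id", Long.class, "id"));
        type.setValueFields(new JdbcTypeField(Types.VARCHAR, "name", String.class, "name"));

        CacheJdbcPojoStoreFactory<Long, Person> storeFactory = new CacheJdbcPojoStoreFactory<>();
        storeFactory.setDialect(new BasicJdbcDialect());
        storeFactory.setTypes(type);
        storeFactory.setDataSourceFactory(() -> {
            PGSimpleDataSource ds = new PGSimpleDataSource();
            ds.setUrl("jdbc:postgresql://localhost:5432/mydb"); // illustrative URL
            return ds;
        });

        return new CacheConfiguration<Long, Person>("personCache")
            .setCacheStoreFactory(storeFactory)
            .setReadThrough(true)
            .setWriteThrough(true);
    }

    // Illustrative value class for the mapping above.
    public static class Person {
        private Long id;
        private String name;

        public Long getId() { return id; }
        public void setId(Long id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }
}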





Could not create .NET CacheStore

2020-01-16 Thread siva
I am using the .NET Core thick client and server with Apache Ignite v2.7.6.

After starting the thick client node, I try to get a cache from the thick client.

client code:
-
public IIgnite StartIgniteClient()
{
    try
    {
        if (_IGNITE_CLIENT == null)
        {
            Ignition.ClientMode = true;
            _IGNITE_CLIENT = Ignition.Start(GetIgniteConfiguration());
        }
    }
    catch (Exception ex)
    {
        throw;
    }

    return _IGNITE_CLIENT;
}

public IgniteConfiguration GetIgniteConfiguration()
{
    IgniteConfiguration config = null;
    try
    {
        config = new IgniteConfiguration
        {
            IgniteInstanceName = "EntityActorIgniteClientNode",

            DiscoverySpi = new TcpDiscoverySpi
            {
                IpFinder = new TcpDiscoveryStaticIpFinder
                {
                    Endpoints = _endPoints // e.g. new[] { "localhost" }, 127.0.0.1 or an IP
                },
                SocketTimeout = TimeSpan.FromSeconds(3000)
            },
            DataStorageConfiguration = new DataStorageConfiguration()
            {
                DefaultDataRegionConfiguration = new DataRegionConfiguration()
                {
                    Name = "IgniteDataRegion",
                    PersistenceEnabled = true
                },
                StoragePath = "C:\\client\\storage",
                WalPath = "C:\\client\\wal",
                WalArchivePath = "C:\\client\\walArchive"
            },
            WorkDirectory = "C:\\client\\work",
            // Explicitly configure TCP communication SPI by changing the local port number
            // for the nodes from the first cluster.
            CommunicationSpi = new TcpCommunicationSpi()
            {
                LocalPort = 47100
            },
            PeerAssemblyLoadingMode = Apache.Ignite.Core.Deployment.PeerAssemblyLoadingMode.CurrentAppDomain
        };
    }
    catch (Exception ex)
    {
        throw;
    }
    return config;
}


var cache = _IGNITE_CLIENT.GetCache(cacheName); // throwing the exception below

Exception:
---
javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: Could not create .NET CacheStore
    at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1337)
    at org.apache.ignite.internal.IgniteKernal.cache(IgniteKernal.java:2905)
    at org.apache.ignite.internal.processors.platform.PlatformProcessorImpl.processInStreamOutObject(PlatformProcessorImpl.java:526)
    at org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inStreamOutObject(PlatformTargetProxyImpl.java:79)
Caused by: class org.apache.ignite.IgniteCheckedException: Could not create .NET CacheStore
    at org.apache.ignite.internal.processors.platform.dotnet.PlatformDotNetCacheStore.initialize(PlatformDotNetCacheStore.java:409)
    at org.apache.ignite.internal.processors.platform.PlatformProcessorImpl.registerStore0(PlatformProcessorImpl.java:746)
    at org.apache.ignite.internal.processors.platform.PlatformProcessorImpl.registerStore(PlatformProcessorImpl.java:324)
    at org.apache.ignite.internal.processors.cache.store.CacheOsStoreManager.start0(CacheOsStoreManager.java:60)
    at org.apache.ignite.internal.processors.cache.GridCacheManagerAdapter.start(GridCacheManagerAdapter.java:50)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCache(GridCacheProcessor.java:1313)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:2172)
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCacheStartRequests(CacheAffinitySharedManager.java:438)
    at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:637)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:391)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2489)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2634)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2553)
    at org.apache.ignite.internal.util.worke

How best to push continuously updating ordered data on Apache ignite cache

2020-01-16 Thread trans
*Usecase*

Here is the topology we are working on

*Server - 1* --> marketData cache, which holds price information for different shares

*Client - 1* --> pushes data onto the cache

*Client - 2* --> runs a continuous query, listening for updates on the marketData
cache on a per-key basis

I want the data to arrive in the order in which it was received and pushed,
because Client - 2 should not get stale data. For example, if the last price of
an instrument moved from 100 to 101 and then 102, Client - 2 should receive the
updates in that order and **not** in an order like 100, then 102, then 101.
So, for any given key in my cache, I want the messages to be applied in order.

Ways of pushing data on cache:

 1. *put* seems to be the safest way, but it looks slow: the full cache update
completes before the thread moves on to the next statement. This might not be
suitable for pushing 2 updates per second.

 2. *putAsync* seems like a good way; my understanding is that it uses the
striped pool to push data onto the cache. Since the striped pool uses max(8,
number of cores) threads, it should be faster. But as multiple threads are
processed in parallel, does it still guarantee that the data lands in the order
it was pushed? (See the sketch after this list.)

 3. *DataStreamer* seems to be the best way, as it also processes entries in
parallel, but again the problem is the order in which data is put into the
cache. The API documentation also mentions that ordering of data is not
guaranteed.
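
For illustration, here is one client-side way to keep per-key order with putAsync (a rough sketch only; the class and cache names are made up): the next update for a key is submitted only after the previous update for that key has completed, so updates for an instrument reach the cache in the order they were produced.

import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.lang.IgniteFuture;

public class OrderedMarketDataPusher {
    private final IgniteCache<String, Double> cache;
    // Tail of the pending-update chain per instrument; the next update is chained after it.
    // (A real implementation would also prune completed tails to avoid unbounded growth.)
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    public OrderedMarketDataPusher(Ignite ignite) {
        this.cache = ignite.getOrCreateCache("marketData");
    }

    public void push(String instrument, double lastPrice) {
        tails.compute(instrument, (key, tail) -> {
            CompletableFuture<Void> prev =
                tail == null ? CompletableFuture.completedFuture(null) : tail;
            // Only issue the next putAsync for this key once the previous one has finished,
            // so per-key order in the cache matches the order push() was called.
            return prev.thenCompose(ignored -> toCompletable(cache.putAsync(key, lastPrice)));
        });
    }

    private static CompletableFuture<Void> toCompletable(IgniteFuture<Void> fut) {
        CompletableFuture<Void> cf = new CompletableFuture<>();
        fut.listen(f -> {
            try {
                cf.complete(f.get());
            }
            catch (Exception e) {
                cf.completeExceptionally(e);
            }
        });
        return cf;
    }
}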


Can someone please clarify the above? I could not find documentation that gives
a clear, deeper understanding of these approaches.
*Out of the above, what is the best way to push continuously updating data like
market data?*





Re: JDBC Connectivity

2020-01-16 Thread narges saleh
Thanks Stephen.

Can you send me an example where the cache and tables are entirely defined
in the XML configuration file (and no POJO), using a query entity or just
JDBC? Let's assume that the SQL code runs on a server node or a thick
client.
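
(For orientation, a rough Java sketch of a cache defined purely through QueryEntity, with no POJO; the names are made up, and the same structure can be expressed as Spring XML beans for CacheConfiguration and QueryEntity.)

import java.util.Collections;
import java.util.LinkedHashMap;
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.QueryEntity;
import org.apache.ignite.cache.QueryIndex;
import org.apache.ignite.configuration.CacheConfiguration;

public class NoPojoSqlCache {
    public static void main(String[] args) {
        // Columns are declared by Java type name, so no POJO class is needed on the classpath.
        LinkedHashMap<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "java.lang.Long");
        fields.put("name", "java.lang.String");

        QueryEntity person = new QueryEntity("java.lang.Long", "PersonType")
            .setTableName("PERSON")
            .setFields(fields)
            .setIndexes(Collections.singletonList(new QueryIndex("name")));

        CacheConfiguration<Long, Object> cfg = new CacheConfiguration<Long, Object>("personCache")
            .setSqlSchema("PUBLIC")
            .setQueryEntities(Collections.singletonList(person));

        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache(cfg);
            // PUBLIC.PERSON is now visible over SQL/JDBC (thin driver: jdbc:ignite:thin://host).
        }
    }
}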


On Thu, Jan 16, 2020 at 8:02 AM Stephen Darlington <
stephen.darling...@gridgain.com> wrote:

> If you create a cache, either in code or XML, using the minimal list of
> parameters, it won’t be accessible using SQL.
>
> There are a number of ways you can define what’s visible using SQL. You
> can use a POJO with the @QuerySqlField annotation (and the indexTypes
> property in the XML file) or define QueryEntities. See the documentation:
> https://www.gridgain.com/docs/latest/developers-guide/SQL/indexes
>
> Whether you do it on the client or server side is a bit of a religious
> debate, but either works. The important thing is that the first definition
> to hit the cluster is the one that takes effect.
>
> The most common pattern I see with JDBC is the caches are defined server
> side, and clients connect using the *thin-client* driver. Thin clients
> just need a hostname and port.
>
> However, there is also a thick-client JDBC driver. The XML here is no
> different from any other node.
>
> Regards,
> Stephen
>
> On 16 Jan 2020, at 12:54, narges saleh  wrote:
>
> Thanks Ilya, Steve.
> 1) What do you mean by SQL enabled? Do I still need to define the POJO
> classes for the objects/tables?
> 2) Can I specify the caches including the table definitions entirely in
> XML config file and pass the config file to the JDBC connection? If yes,
> I'd greatly appreciate it if you provide some small samples. Please keep in
> mind that we have native persistence in place not a third party database.
>
>
>
> On Wed, Jan 15, 2020 at 7:29 AM Ilya Kasnacheev 
> wrote:
>
>> Hello!
>>
>> 4) I actually think that if you specify caches in thick client's config
>> file, and they are absent on server, they will be created.
>>
>> (However, they will not be changed if configuration differs)
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Wed, 15 Jan 2020 at 15:59, narges saleh :
>>
>>> Hi All,
>>>
>>> I am trying to use ignite's cache grid with native persistence and
>>> prefer to use JDBC for cache/db connectivity.
>>>
>>> 1) Is this possible, in either client or server mode?
>>> 2) If yes, I assume, I'd need one JDBC connection per cache, as I see it
>>> is possible to specify only one cache per JDBC connection. Is this right?
>>> 3) Is this also true if I need to join multiple tables/caches?
>>> 4) Can I specify my caches in XML config file and just pass the config
>>> file to the JDBC connection?
>>> 5) Will I get the same load performance if I JDBC with streaming set to
>>> true as I'd using the streamer module directly (I see that I can specify
>>> most of the streamer config options on JDBC connection configuration)?
>>>
>>> thanks.
>>>
>>
>
>


Persistent Data Only Available after Repeatedly Restarting Pod in k8s

2020-01-16 Thread kellan
I'm running Ignite 2.7.6 on Kubernetes and have noticed that my persistent
data often isn't available after restarting my Ignite pod. Sometimes I have to
restart the pod one or more times before I can access any of my data.

What could be causing this?
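
(Not a diagnosis, just a sketch of two things worth double-checking: that the storage/WAL/work directories all live on the mounted persistent volume rather than inside the container filesystem, and that the node comes back with a stable consistentId. All paths and the POD_NAME variable below are illustrative assumptions.)

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PersistentPodNode {
    public static void main(String[] args) {
        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        // Keep every persistence directory on the persistent volume mount,
        // not under an ephemeral work directory inside the container.
        storage.setStoragePath("/persistence/storage");
        storage.setWalPath("/persistence/wal");
        storage.setWalArchivePath("/persistence/walArchive");

        IgniteConfiguration cfg = new IgniteConfiguration()
            // A stable consistentId ties the restarted pod back to its own persistence files;
            // with a StatefulSet, the pod name is one candidate for this.
            .setConsistentId(System.getenv("POD_NAME"))
            .setWorkDirectory("/persistence/work")
            .setDataStorageConfiguration(storage);

        Ignition.start(cfg);
    }
}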





Re: Streamer and data loss

2020-01-16 Thread narges saleh
Hello Ilya,

If I use the putAll() operation, then I won't get the streamer's bulk
performance, will I? I have a huge amount of data to persist.

thanks.
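
(For reference, a minimal sketch of the streamer settings discussed in this thread; the cache name and the one-second flush interval are illustrative choices, not recommendations.)

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class StreamerLoad {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("events"); // the target cache must exist before streaming

            try (IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("events")) {
                // allowOverwrite(true) lets a resent batch simply overwrite entries that were
                // already flushed before a failure (dedup by key), at some cost in throughput.
                streamer.allowOverwrite(true);
                // Flush buffered entries at least once a second, bounding how much
                // unflushed data is at risk if the streaming node dies.
                streamer.autoFlushFrequency(1_000);

                for (long key = 0; key < 1_000_000; key++)
                    streamer.addData(key, "value-" + key);

                streamer.flush(); // push anything still buffered before close
            }
        }
    }
}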

On Thu, Jan 16, 2020 at 8:43 AM Ilya Kasnacheev 
wrote:

> Hello!
>
> I think you should consider using putAll() operation if resiliency is
> important for you, since this operation will be salvaged if initiator node
> fails.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Thu, 16 Jan 2020 at 15:48, narges saleh :
>
>> Thanks Saikat.
>>
>> I am not sure if sequential keys/timestamps and Kafka like offsets would
>> help if there are many data source clients and many streamer nodes in play;
>> depending on the checkpoint, we might still end up duplicates (unless
>> you're saying each client sequences its payload before sending it to the
>> streamer; even then duplicates are possible on the cache). The only sure
>> way, it seems to me, is for the client that catches the exception to check
>> the cache and only resend the diff, which make things very complex. The
>> other approach, if I am right is, to enable overwrite, so the streamer
>> would dedup the data in cache. The latter is costly too. I think the ideal
>> approach would have been if there were some type of streamer resiliency in
>> place where another streamer node could pick up the buffer from a crashed
>> streamer and continue the work.
>>
>>
>> On Wed, Jan 15, 2020 at 9:00 PM Saikat Maitra 
>> wrote:
>>
>>> Hi,
>>>
>>> To minimise data loss during streamer node failure I think we can use
>>> the following steps:
>>>
>>> 1. Use autoFlushFrequency param to set the desired flush frequency,
>>> depending on desired consistency level and performance you can choose how
>>> frequently you would like the data to be flush to Ignite nodes.
>>>
>>> 2. Develop a automated checkpointing process to capture and store the
>>> source data offset, it can be something like kafka message offset or cache
>>> keys if keys are sequential or timestamp for last flush and depending on
>>> that the Ignite client can restart the data streaming process from last
>>> checkpoint if there are node failure.
>>>
>>> HTH
>>>
>>> Regards,
>>> Saikat
>>>
>>> On Fri, Jan 10, 2020 at 4:34 AM narges saleh 
>>> wrote:
>>>
 Thanks Saikat for the feedback.

 But if I use the overwrite option set to true to avoid duplicates in
 case I have to resend the entire payload in case of a streamer node
 failure, then I won't
  get optimal performance, right?
 What's the best practice for dealing with data streamer node failures?
 Are there examples?

 On Thu, Jan 9, 2020 at 9:12 PM Saikat Maitra 
 wrote:

> Hi,
>
> AFAIK, the DataStreamer check for presence of key and if it is present
> in the cache then it does not allow overwrite of value if allowOverwrite 
> is
> set to false.
>
> Regards,
> Saikat
>
> On Thu, Jan 9, 2020 at 6:04 AM narges saleh 
> wrote:
>
>> Thanks Andrei.
>>
>> If the external data source client sending batches of 2-3 MB say via
>> TCP socket connection to a bunch of socket streamers (deployed as ignite
>> services deployed to each ignite node) and say of the streamer nodes die,
>> the data source client catching the exception, has to check the cache to
>> see how much of the 2-4MB batch has been flushed to cache and resend the
>> rest? Would setting streamer with overwrite set to true work, if the data
>> source client resend the entire batch?
>> A question regarding streamer with overwrite option set to true. How
>> does the streamer compare the content the data in hand with the data in
>> cache, if each record is being assigned UUID when being  inserted to 
>> cache?
>>
>>
>> On Tue, Jan 7, 2020 at 4:40 AM Andrei Aleksandrov <
>> aealexsand...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Data that has not been flushed from a data streamer will be lost. A data
>>> streamer works through a single Ignite node, and if that node fails it
>>> cannot simply resume through another one. So your application should think
>>> about how to track that all data was loaded (wait for completion of
>>> loading, catch the exceptions, check the cache sizes, etc.) and use
>>> another client for data loading in case the previous one failed.
>>>
>>> BR,
>>> Andrei
>>>
>>> 1/6/2020 2:37 AM, narges saleh пишет:
>>> > Hi All,
>>> >
>>> > Another question regarding ignite's streamer.
>>> > What happens to the data if the streamer node crashes before the
>>> > buffer's content is flushed to the cache? Is the client
>>> responsible
>>> > for making sure the data is persisted or ignite redirects the data
>>> to
>>> > another node's streamer?
>>> >
>>> > thanks.
>>>
>>


Re: Ignite partitioned mode not scaling

2020-01-16 Thread Ilya Kasnacheev
Hello!

Unfortunately, we can only help you if we have a runnable reproducer, meaning
you should prepare a stripped-down project which still exhibits this behavior
while not containing any proprietary information, and share it with us via
e.g. GitHub.
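
(As an illustration of what a runnable reproducer could look like, and not part of the original reply: a self-contained client that measures the 95th-percentile latency of random get() calls. The cache name, key count and thread count are arbitrary.)

import java.util.Arrays;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class GetBenchmark {
    public static void main(String[] args) throws Exception {
        int keys = 1_000_000, threads = 32, opsPerThread = 10_000;

        Ignition.setClientMode(true);
        // Default configuration for brevity; a real run would point discovery at the cluster under test.
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("bench");

            long[] latencies = new long[threads * opsPerThread];
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            CountDownLatch done = new CountDownLatch(threads);

            for (int t = 0; t < threads; t++) {
                int offset = t * opsPerThread;
                pool.submit(() -> {
                    for (int i = 0; i < opsPerThread; i++) {
                        long start = System.nanoTime();
                        cache.get(ThreadLocalRandom.current().nextInt(keys));
                        latencies[offset + i] = System.nanoTime() - start;
                    }
                    done.countDown();
                });
            }

            done.await();
            pool.shutdown();

            Arrays.sort(latencies);
            System.out.printf("p95 = %.2f ms%n", latencies[(int) (latencies.length * 0.95)] / 1e6);
        }
    }
}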

Regards,
-- 
Ilya Kasnacheev


Fri, 3 Jan 2020 at 20:24, Rajan Ahlawat :

> We are using following ignite client :
>
> org.apache.ignite:ignite-core:2.6.0
> org.apache.ignite:ignite-spring-data:2.6.0
>
> The benchmark source code is pretty simple; it does the following:
>
> Executors.newFixedThreadPool(threadPoolSize), working behind a rateLimiter,
> executes the threads.
> Each thread makes a get query against three IgniteRepository caches,
> something like this:
> memberCacheRepository.getMemberCacheObjectByMemberUuid(memberUuid)
>
> The caches are created during Spring Boot application load via a
> CacheConfiguration like:
>
> CacheConfiguration createSqlCacheConfig(String cacheName, String dataRegion) {
>     CacheConfiguration sqlCacheConfig = new CacheConfiguration(cacheName);
>     sqlCacheConfig.setBackups(0);
>     sqlCacheConfig.setWriteSynchronizationMode(CacheWriteSynchronizationMode.PRIMARY_SYNC);
>     sqlCacheConfig.setCacheMode(CacheMode.PARTITIONED);
>     sqlCacheConfig.setDataRegionName(dataRegion);
>     return sqlCacheConfig;
> }
>
> I am sorry, but I won't be able to share the complete code; please let me
> know what specific information is required.
>
>
>
> On Fri, Jan 3, 2020 at 2:45 PM Mikhail Cherkasov 
> wrote:
>
>> What type of client do you use? is it JDBC thin driver?
>>
>> The best if you can share benchmark source code, so we can see what
>> queries you use, what flags you set to queries and etc.
>>
>> On Thu, Jan 2, 2020 at 10:07 PM Rajan Ahlawat 
>> wrote:
>>
>>> If QPS > 2000, I am using multiple hosts for the application that is
>>> shooting requests at the cache.
>>> If the benchmark were the bottleneck, we shouldn't see a drop from 2600 to
>>> 2200 when we go from a 1-node to a 3-node cluster.
>>>
>>> On Fri, Jan 3, 2020 at 11:24 AM Rajan Ahlawat 
>>> wrote:
>>>
 Hi Mikhail

 could you please share the benchmark code with us?
 I am first filling up around a million records in cache. Then through
 direct cache service classes, fetching those records randomly.

 do you run queries against the same amount of records each time?
 Yes, 2600 QPS means, it picks 2600 records randomly over a second and
 do get query over sql caches of different tables.

 what host machines do you use for your nodes? when you say that you
 have 5 nodes, does it mean that you use 5 dedicated machines for each node?
 Yes, these are five dedicated linux machines.

 Also, it might be that the benchmark itself is the bottleneck, so your
 system can handle more QPS, but you need to run a benchmark from several
 machines. Please try to use at least 2 hosts for the benchmark application
 and check if there any changes in QPS.
 As you can see in the table, I have tried different combinations of
 nodes, and with each increase in node count, the QPS of requests served
 under 50ms goes down.


 On Fri, Jan 3, 2020 at 1:29 AM Mikhail Cherkasov <
 mcherka...@gridgain.com> wrote:

> Hi Rajan,
>
> could you please share the benchmark code with us?
> do you run queries against the same amount of records each time?
> what host machines do you use for your nodes? when you say that you
> have 5 nodes, does it mean that you use 5 dedicated machines for each
> node?
> Also, it might be that the benchmark itself is the bottleneck, so your
> system can handle more QPS, but you need to run a benchmark from several
> machines. Please try to use at least 2 hosts for the benchmark application
> and check if there any changes in QPS.
>
> Thanks,
> Mike.
>
> On Thu, Jan 2, 2020 at 2:49 AM Rajan Ahlawat 
> wrote:
>
>>
>>
>> -- Forwarded message -
>> From: Rajan Ahlawat 
>> Date: Thu, Jan 2, 2020 at 4:05 PM
>> Subject: Ignite partitioned mode not scaling
>> To: 
>>
>>
>> We are moving from a replicated (1-node) cluster to a multinode
>> partitioned cluster.
>> So the assumption was that the max QPS we can reach would increase as
>> nodes are added to the cluster.
>> We compared the under-50ms QPS stats of partitioned mode with an increasing
>> number of nodes in the cluster, and found that performance actually degraded.
>> We are using Ignite key-value as well as SQL caches, with most of the
>> data in SQL caches; no persistence is being used.
>>
>> Please let us know what we are doing wrong or what can be done to
>> make it scalable.
>> Here are the results of the perf tests:
>>
>> *50ms in 95 percentile comparison of partitioned-mode*
>>
>> Response time in ms
>> cache mode (partitioned) | QPS | read from sql table | read from sql table
>> with

Re: Node reconnects to Zookeeper in event of GC

2020-01-16 Thread Ilya Kasnacheev
Hello!

I don't think a node should re-register. A node should either observe
failureDetectionTimeout (and the corresponding ZooKeeper timeouts) or segment
from the cluster and shut down.

You can restart the node later, manually or automatically.
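
(A minimal sketch of what "automatically" could look like, not part of the original reply: keep the segmentation policy at NOOP and restart the node from a local event listener. EVT_NODE_SEGMENTED has to be enabled explicitly, and whether NOOP is acceptable from a split-brain point of view depends on your setup.)

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.Event;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;

public class RestartOnSegmentation {
    public static void main(String[] args) {
        restartWhenSegmented(Ignition.start(configure()));
    }

    private static IgniteConfiguration configure() {
        return new IgniteConfiguration()
            // NOOP keeps the JVM alive after segmentation so we can decide what to do ourselves.
            .setSegmentationPolicy(SegmentationPolicy.NOOP)
            // Segmentation events are only delivered to listeners if enabled explicitly.
            .setIncludeEventTypes(EventType.EVT_NODE_SEGMENTED);
    }

    private static void restartWhenSegmented(Ignite ignite) {
        IgnitePredicate<Event> onSegmented = evt -> {
            // Restart from a separate thread instead of letting a segmented node
            // keep serving potentially stale data.
            new Thread(() -> {
                Ignition.stop(ignite.name(), true);                 // stop the segmented instance
                restartWhenSegmented(Ignition.start(configure()));  // rejoin and re-arm the listener
            }).start();
            return false; // stop listening on the old (stopped) instance
        };
        ignite.events().localListen(onSegmented, EventType.EVT_NODE_SEGMENTED);
    }
}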

Regards,
-- 
Ilya Kasnacheev


Tue, 7 Jan 2020 at 03:16, swattal :

> Thanks for the response Evgenii.
>
> If I choose to use the NOOP SegmentationPolicy, will this cause a split-brain
> issue in the cluster? We are not using the ignite.sh script to start the
> servers but the Java API to start the cluster. Is there a possibility to
> re-register the server node back to ZK using a watcher?
>
> Thanks,
> Sumit
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Continuous query order on transactional cache

2020-01-16 Thread Ilya Kasnacheev
Hello!

Why?

I don't think that transactions semantically make any guarantees about the
order of updates inside a transaction.

I'd go with A).
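
(Illustrative only, not part of the original reply: the kind of listener being discussed, on a transactional cache. Per the answer above, if several of the notified entries were committed in one transaction, no particular order should be relied on.)

import javax.cache.event.CacheEntryEvent;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheAtomicityMode;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.configuration.CacheConfiguration;

public class TxCacheListener {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            CacheConfiguration<Integer, String> ccfg = new CacheConfiguration<Integer, String>("txCache")
                .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(ccfg);

            ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();
            qry.setLocalListener(events -> {
                // Entries committed together in one transaction may show up here
                // in an undefined order relative to each other.
                for (CacheEntryEvent<? extends Integer, ? extends String> e : events)
                    System.out.println(e.getKey() + " -> " + e.getValue());
            });

            cache.query(qry); // keep the returned cursor open for as long as notifications are needed

            // ... transactions updating txCache elsewhere will now trigger the listener.
        }
    }
}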

Regards,
-- 
Ilya Kasnacheev


Thu, 16 Jan 2020 at 17:39, Barney Pippin :

> Hi,
>
> If I have a continuous query running and it's listening to a transactional
> cache, what order will I receive the notifications if say 5 updates are
> committed in a single transaction?
>
> Is the order:
> A) Undefined
> B) The order the cache updates are written to the cache prior to the commit
> C) Another order?
>
> Thanks,
>
> James
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>


Re: Streamer and data loss

2020-01-16 Thread Ilya Kasnacheev
Hello!

I think you should consider using putAll() operation if resiliency is
important for you, since this operation will be salvaged if initiator node
fails.

Regards,
-- 
Ilya Kasnacheev


Thu, 16 Jan 2020 at 15:48, narges saleh :

> Thanks Saikat.
>
> I am not sure if sequential keys/timestamps and Kafka like offsets would
> help if there are many data source clients and many streamer nodes in play;
> depending on the checkpoint, we might still end up duplicates (unless
> you're saying each client sequences its payload before sending it to the
> streamer; even then duplicates are possible on the cache). The only sure
> way, it seems to me, is for the client that catches the exception to check
> the cache and only resend the diff, which make things very complex. The
> other approach, if I am right is, to enable overwrite, so the streamer
> would dedup the data in cache. The latter is costly too. I think the ideal
> approach would have been if there were some type of streamer resiliency in
> place where another streamer node could pick up the buffer from a crashed
> streamer and continue the work.
>
>
> On Wed, Jan 15, 2020 at 9:00 PM Saikat Maitra 
> wrote:
>
>> Hi,
>>
>> To minimise data loss during streamer node failure I think we can use the
>> following steps:
>>
>> 1. Use autoFlushFrequency param to set the desired flush frequency,
>> depending on desired consistency level and performance you can choose how
>> frequently you would like the data to be flush to Ignite nodes.
>>
>> 2. Develop a automated checkpointing process to capture and store the
>> source data offset, it can be something like kafka message offset or cache
>> keys if keys are sequential or timestamp for last flush and depending on
>> that the Ignite client can restart the data streaming process from last
>> checkpoint if there are node failure.
>>
>> HTH
>>
>> Regards,
>> Saikat
>>
>> On Fri, Jan 10, 2020 at 4:34 AM narges saleh 
>> wrote:
>>
>>> Thanks Saikat for the feedback.
>>>
>>> But if I use the overwrite option set to true to avoid duplicates in
>>> case I have to resend the entire payload in case of a streamer node
>>> failure, then I won't
>>>  get optimal performance, right?
>>> What's the best practice for dealing with data streamer node failures?
>>> Are there examples?
>>>
>>> On Thu, Jan 9, 2020 at 9:12 PM Saikat Maitra 
>>> wrote:
>>>
 Hi,

 AFAIK, the DataStreamer check for presence of key and if it is present
 in the cache then it does not allow overwrite of value if allowOverwrite is
 set to false.

 Regards,
 Saikat

 On Thu, Jan 9, 2020 at 6:04 AM narges saleh 
 wrote:

> Thanks Andrei.
>
> If the external data source client sending batches of 2-3 MB say via
> TCP socket connection to a bunch of socket streamers (deployed as ignite
> services deployed to each ignite node) and say of the streamer nodes die,
> the data source client catching the exception, has to check the cache to
> see how much of the 2-4MB batch has been flushed to cache and resend the
> rest? Would setting streamer with overwrite set to true work, if the data
> source client resend the entire batch?
> A question regarding streamer with overwrite option set to true. How
> does the streamer compare the content the data in hand with the data in
> cache, if each record is being assigned UUID when being  inserted to 
> cache?
>
>
> On Tue, Jan 7, 2020 at 4:40 AM Andrei Aleksandrov <
> aealexsand...@gmail.com> wrote:
>
>> Hi,
>>
>> Data that has not been flushed from a data streamer will be lost. A data
>> streamer works through a single Ignite node, and if that node fails it can't
>> simply resume through another one. So your application should think
>> about how to track that all data was loaded (wait for completion of
>> loading, catch the exceptions, check the cache sizes, etc.) and use
>> another client for data loading in case the previous one failed.
>>
>> BR,
>> Andrei
>>
>> 1/6/2020 2:37 AM, narges saleh пишет:
>> > Hi All,
>> >
>> > Another question regarding ignite's streamer.
>> > What happens to the data if the streamer node crashes before the
>> > buffer's content is flushed to the cache? Is the client responsible
>> > for making sure the data is persisted or ignite redirects the data
>> to
>> > another node's streamer?
>> >
>> > thanks.
>>
>


Continuous query order on transactional cache

2020-01-16 Thread Barney Pippin
Hi,

If I have a continuous query running and it's listening to a transactional
cache, what order will I receive the notifications if say 5 updates are
committed in a single transaction?

Is the order:
A) Undefined
B) The order the cache updates are written to the cache prior to the commit
C) Another order?

Thanks,

James





Re: JDBC Connectivity

2020-01-16 Thread Stephen Darlington
If you create a cache, either in code or XML, using the minimal list of
parameters, it won’t be accessible using SQL.

There are a number of ways you can define what’s visible using SQL. You can use 
a POJO with the @QuerySqlField annotation (and the indexTypes property in the 
XML file) or define QueryEntities. See the documentation:
https://www.gridgain.com/docs/latest/developers-guide/SQL/indexes 

Whether you do it on the client or server side is a bit of a religious debate, 
but either works. The important thing is that the first definition to hit the 
cluster is the one that takes effect. 

The most common pattern I see with JDBC is the caches are defined server side, 
and clients connect using the thin-client driver. Thin clients just need a 
hostname and port.

However, there is also a thick-client JDBC driver. The XML here is no different 
from any other node.

Regards,
Stephen

> On 16 Jan 2020, at 12:54, narges saleh  wrote:
> 
> Thanks Ilya, Steve.
> 1) What do you mean by SQL enabled? Do I still need to define the POJO 
> classes for the objects/tables?
> 2) Can I specify the caches including the table definitions entirely in XML 
> config file and pass the config file to the JDBC connection? If yes, I'd 
> greatly appreciate it if you provide some small samples. Please keep in mind 
> that we have native persistence in place not a third party database.
> 
> 
> 
> On Wed, Jan 15, 2020 at 7:29 AM Ilya Kasnacheev  > wrote:
> Hello!
> 
> 4) I actually think that if you specify caches in thick client's config file, 
> and they are absent on server, they will be created.
> 
> (However, they will not be changed if configuration differs)
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> Wed, 15 Jan 2020 at 15:59, narges saleh  >:
> Hi All,
> 
> I am trying to use ignite's cache grid with native persistence and prefer to 
> use JDBC for cache/db connectivity.
> 
> 1) Is this possible, in either client or server mode?
> 2) If yes, I assume, I'd need one JDBC connection per cache, as I see it is 
> possible to specify only one cache per JDBC connection. Is this right?
> 3) Is this also true if I need to join multiple tables/caches?
> 4) Can I specify my caches in XML config file and just pass the config file 
> to the JDBC connection?
> 5) Will I get the same load performance if I JDBC with streaming set to true 
> as I'd using the streamer module directly (I see that I can specify most of 
> the streamer config options on JDBC connection configuration)?
> 
> thanks.




Re: JDBC Connectivity

2020-01-16 Thread narges saleh
Thanks Ilya, Steve.
1) What do you mean by SQL enabled? Do I still need to define the POJO
classes for the objects/tables?
2) Can I specify the caches including the table definitions entirely in XML
config file and pass the config file to the JDBC connection? If yes, I'd
greatly appreciate it if you provide some small samples. Please keep in
mind that we have native persistence in place not a third party database.



On Wed, Jan 15, 2020 at 7:29 AM Ilya Kasnacheev 
wrote:

> Hello!
>
> 4) I actually think that if you specify caches in thick client's config
> file, and they are absent on server, they will be created.
>
> (However, they will not be changed if configuration differs)
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Wed, 15 Jan 2020 at 15:59, narges saleh :
>
>> Hi All,
>>
>> I am trying to use ignite's cache grid with native persistence and prefer
>> to use JDBC for cache/db connectivity.
>>
>> 1) Is this possible, in either client or server mode?
>> 2) If yes, I assume, I'd need one JDBC connection per cache, as I see it
>> is possible to specify only one cache per JDBC connection. Is this right?
>> 3) Is this also true if I need to join multiple tables/caches?
>> 4) Can I specify my caches in XML config file and just pass the config
>> file to the JDBC connection?
>> 5) Will I get the same load performance if I JDBC with streaming set to
>> true as I'd using the streamer module directly (I see that I can specify
>> most of the streamer config options on JDBC connection configuration)?
>>
>> thanks.
>>
>


Re: Streamer and data loss

2020-01-16 Thread narges saleh
Thanks Saikat.

I am not sure if sequential keys/timestamps and Kafka-like offsets would
help if there are many data source clients and many streamer nodes in play;
depending on the checkpoint, we might still end up with duplicates (unless
you're saying each client sequences its payload before sending it to the
streamer; even then duplicates are possible in the cache). The only sure
way, it seems to me, is for the client that catches the exception to check
the cache and resend only the diff, which makes things very complex. The
other approach, if I am right, is to enable overwrite, so the streamer
would dedup the data in the cache. The latter is costly too. I think the
ideal approach would be some type of streamer resiliency, where another
streamer node could pick up the buffer from a crashed streamer and continue
the work.


On Wed, Jan 15, 2020 at 9:00 PM Saikat Maitra 
wrote:

> Hi,
>
> To minimise data loss during streamer node failure I think we can use the
> following steps:
>
> 1. Use autoFlushFrequency param to set the desired flush frequency,
> depending on desired consistency level and performance you can choose how
> frequently you would like the data to be flush to Ignite nodes.
>
> 2. Develop a automated checkpointing process to capture and store the
> source data offset, it can be something like kafka message offset or cache
> keys if keys are sequential or timestamp for last flush and depending on
> that the Ignite client can restart the data streaming process from last
> checkpoint if there are node failure.
>
> HTH
>
> Regards,
> Saikat
>
> On Fri, Jan 10, 2020 at 4:34 AM narges saleh  wrote:
>
>> Thanks Saikat for the feedback.
>>
>> But if I use the overwrite option set to true to avoid duplicates in case
>> I have to resend the entire payload in case of a streamer node failure,
>> then I won't
>>  get optimal performance, right?
>> What's the best practice for dealing with data streamer node failures?
>> Are there examples?
>>
>> On Thu, Jan 9, 2020 at 9:12 PM Saikat Maitra 
>> wrote:
>>
>>> Hi,
>>>
>>> AFAIK, the DataStreamer check for presence of key and if it is present
>>> in the cache then it does not allow overwrite of value if allowOverwrite is
>>> set to false.
>>>
>>> Regards,
>>> Saikat
>>>
>>> On Thu, Jan 9, 2020 at 6:04 AM narges saleh 
>>> wrote:
>>>
 Thanks Andrei.

 If the external data source client sending batches of 2-3 MB say via
 TCP socket connection to a bunch of socket streamers (deployed as ignite
 services deployed to each ignite node) and say of the streamer nodes die,
 the data source client catching the exception, has to check the cache to
 see how much of the 2-4MB batch has been flushed to cache and resend the
 rest? Would setting streamer with overwrite set to true work, if the data
 source client resend the entire batch?
 A question regarding streamer with overwrite option set to true. How
 does the streamer compare the content the data in hand with the data in
 cache, if each record is being assigned UUID when being  inserted to cache?


 On Tue, Jan 7, 2020 at 4:40 AM Andrei Aleksandrov <
 aealexsand...@gmail.com> wrote:

> Hi,
>
> Data that has not been flushed from a data streamer will be lost. A data
> streamer works through a single Ignite node, and if that node fails it can't
> simply resume through another one. So your application should think
> about how to track that all data was loaded (wait for completion of
> loading, catch the exceptions, check the cache sizes, etc.) and use
> another client for data loading in case the previous one failed.
>
> BR,
> Andrei
>
> 1/6/2020 2:37 AM, narges saleh пишет:
> > Hi All,
> >
> > Another question regarding ignite's streamer.
> > What happens to the data if the streamer node crashes before the
> > buffer's content is flushed to the cache? Is the client responsible
> > for making sure the data is persisted or ignite redirects the data
> to
> > another node's streamer?
> >
> > thanks.
>