Re: Batch updates in Ignite B+ tree.

2019-03-05 Thread Vladimir Ozerov
Hi Pavel,

As far as I know, batch tree updates are already being developed. Alex, could
you please elaborate?

On Tue, Mar 5, 2019 at 5:05 PM Pavel Pereslegin  wrote:

> Hi Igniters!
>
> I am working on implementing batch updates in PageMemory [1] to
> improve the performance of preloader, datastreamer and putAll.
>
> This task consists of two major related improvements:
> 1. Batch writing to PageMemory via FreeList - store several values at
> once to single memory page.
> 2. Batch updates in BPlusTree (for introducing invokeAll operation).
>
> I started to investigate the issue with batch updates in B+ tree, and
> it seems that the concurrent top-down balancing algorithm (TD)
> described in this paper [2] may be suitable for batch insertion of
> keys into Ignite B+ Tree.
> This algorithm uses a top-down balancing approach and allows inserting
> a batch of keys that belong to leaves sharing the same parent. The
> drawback of the top-down balancing approach is that the parent node
> is locked while insertion/splitting is performed in child nodes.
>
> WDYT? Do you know other approaches for implementing batch updates in
> Ignite B+ Tree?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-7935
> [2]
> https://aaltodoc.aalto.fi/bitstream/handle/123456789/2168/isbn9512258951.pdf
>


[jira] [Created] (IGNITE-11487) Document IGNITE_SQL_MERGE_TABLE_MAX_SIZE property

2019-03-05 Thread Evgenii Zhuravlev (JIRA)
Evgenii Zhuravlev created IGNITE-11487:
--

 Summary: Document IGNITE_SQL_MERGE_TABLE_MAX_SIZE property
 Key: IGNITE-11487
 URL: https://issues.apache.org/jira/browse/IGNITE-11487
 Project: Ignite
  Issue Type: Improvement
  Components: documentation
Reporter: Evgenii Zhuravlev
Assignee: Prachi Garg






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSSION] System cache persistence.

2019-03-05 Thread Vyacheslav Daradur
Hi, Andrey!

>> 6. ServiceGrid.
>> We can use Metastore and drop old-services later.

As you mentioned, the new Service Grid does not use the system cache.

The legacy implementation (GridServiceProcessor) has used the system cache
WITHOUT persistence since the 2.3 release and does not restore service
state at node restart [1].

[1] https://issues.apache.org/jira/browse/IGNITE-6629

On Tue, Mar 5, 2019 at 4:21 PM Andrey Mashenkov
 wrote:
>
> Hi Igniters,
>
> I'd like to start a discussion to avoid system cache usage with persistence.
>
> System cache is used in a number of component internals. No one cared that
> the system cache can hold stale data after a grid restart, as that wasn't
> possible before 2.1.
> Since Ignite 2.1 the system cache can be persistent, which may affect the
> behavior of these components:
> Compute, IGFS, ServiceGrid, DataStructures.
>
> What's wrong?
> 1. System cache is persistent only if the default region is configured as
> persistent (and vice versa).
> This is non-obvious and can cause unpredictable issues.
>
> 2. Any change in the system cache requires a distributed transaction, which may
> cause a deadlock.
> We already avoid its usage in BinaryMarshaller and (almost) in recently
> reworked ServiceGrid due to the "deadlock" reason.
>
>
> What is affected, and what can we do?
> 3. IGFS
> AFAIK, IGFS support is going to be discontinued. There is nothing to do if
> IGFS will be removed in 3.0.
>
> 4. DataStructures
> It looks broken (maybe partially), as I see CacheDataStructuresManager
> uses on-heap maps for id->structure mapping for some structures.
> It looks safe to deprecate persistence for data structures for now
> and rework them separately.
> Also, from the user's perspective, I'd expect data structure persistence to be
> configured in some separate place or in the data structure configuration.
>
> 5. Compute
> Let's rework this to use Metastore.
>
> 6. ServiceGrid.
> We can use Metastore and drop old-services later.
>
> 7. Some third-party plugins may be affected.
> Of course, there is no compatibility guarantee if someone uses internal
> components, but issue #1 can frustrate users.
> We can prevent the system cache from being persistent.
>
> Do we really ever need System cache with persistence enabled?
> Thoughts?
>
> I've created a ticket for this [1].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-11483
>
> --
> Best regards,
> Andrey V. Mashenkov



--
Best Regards, Vyacheslav D.


Re: Tests for ML using binary builds

2019-03-05 Thread Vyacheslav Daradur
Hi, Alexey!

>>  If we can use multi JVM test with
>> different classpaths I will use them - such approach is more convenient
>> from TC point of view.

There is no such ability at the moment; you can only specify
additional JVM arguments in
'GridAbstractTest#additionalRemoteJvmArgs'. But it is not very hard
to implement if needed, see 'IgniteNodeRunner'.

We use this approach in our Compatibility Framework.
BTW, you can use the framework for your goals: prepare the artifacts
and install them into the local Maven repository (mvn install), then
call 'startGrid(name, ver)' with your prepared version, e.g.
"2.8-SNAPSHOT".

On Tue, Mar 5, 2019 at 2:48 PM Alexey Platonov  wrote:
>
> Ivan,
> Thanks for your answer. I want to use binary builds explicitly because they
> don't share jars of client code. If we can use multi JVM test with
> different classpaths I will use them - such approach is more convenient
> from TC point of view.
>
> P.S. I use Docker in my prototype just because it is easy for me and for
> test cluster management - I can create docker-image with all configs and
> scripts and run Ignite cluster in a separate network.
>
> On Tue, Mar 5, 2019 at 12:28 PM Павлухин Иван  wrote:
>
> > Alexey,
> >
> > If problems arise in environments different from one where usual
> > Ignite tests run then definitely it is a good idea to cover it. And
> > testing other build kinds and in other environments is a good idea as
> > well. But a particular problem with serialization and peer class
> > loading is not clear for me. Why binary builds and Docker are needed
> > there? Why multi JVM tests from Ignite testing framework cannot reveal
> > mentioned problems?
> >
> > Ideally I think we should aggregate all failure reporting in common
> > place. And for me TC bot is the best choice. Consequently it should be
> > TeamCity most likely.
> >
> > But all in all I think we can give it a try according to your proposal
> > and see how things go.
> >
> > Tue, Mar 5, 2019 at 11:09, dmitrievanthony :
> > >
> > > Hi Alexey,
> > >
> > > I think it's a great idea. Travis + Docker is a very good and cheap
> > > solution, so we could start with it. Regarding the statistics, Travis
> > > allows checking the last build status using a badge, so it also
> > > shouldn't be a problem.
> > >
> > > Best regards,
> > > Anton Dmitriev.
> > >
> > >
> > >
> > > --
> > > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> >
> >
> >
> > --
> > Best regards,
> > Ivan Pavlukhin
> >



-- 
Best Regards, Vyacheslav D.


[jira] [Created] (IGNITE-11486) Support Automatic modules for ignite-zookeeper: Resolve issues with logging packages conflict

2019-03-05 Thread Dmitriy Pavlov (JIRA)
Dmitriy Pavlov created IGNITE-11486:
---

 Summary: Support Automatic modules for ignite-zookeeper: Resolve 
issues with logging packages conflict
 Key: IGNITE-11486
 URL: https://issues.apache.org/jira/browse/IGNITE-11486
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Pavlov


Usage of the Ignite Zookeeper module in a modular environment fails:
{noformat}
error: the unnamed module reads package org.apache.log4j from both 
slf4j.log4j12 and log4j
{noformat}

slf4j version is updated by the build system when Ignite Zookeeper is used.
{noformat}
+--- org.slf4j:slf4j-api:1.7.7 -> 1.7.25
+--- org.slf4j:slf4j-log4j12:1.7.7 -> 1.7.25
  +--- org.slf4j:slf4j-api:1.7.25
  \--- log4j:log4j:1.2.17
{noformat}





[jira] [Created] (IGNITE-11485) Support Automatic modules for ignite-hibernate: Package interference with hibernate core and hibernate for particular version

2019-03-05 Thread Dmitriy Pavlov (JIRA)
Dmitriy Pavlov created IGNITE-11485:
---

 Summary: Support Automatic modules for ignite-hibernate: Package 
interference with hibernate core and hibernate for particular version
 Key: IGNITE-11485
 URL: https://issues.apache.org/jira/browse/IGNITE-11485
 Project: Ignite
  Issue Type: Sub-task
Reporter: Dmitriy Pavlov


Hibernate 5.3:
{noformat}
error: the unnamed module reads package org.apache.ignite.cache.hibernate from 
both ignite.hibernate.5.3 and ignite.hibernate.core
{noformat}

Hibernate 5.1:
{noformat}
error: the unnamed module reads package org.apache.ignite.cache.hibernate from 
both ignite.hibernate.core and ignite.hibernate.5.1
{noformat}

Hibernate 4.2:
{noformat}
error: the unnamed module reads package org.apache.ignite.cache.hibernate from 
both ignite.hibernate.core and ignite.hibernate.4.2
{noformat}

Probably we should move the classes from the hibernate-core module to the 
org.apache.ignite.cache.hibernate.core package, but this may affect the public API.

The following classes will be moved if we change the core package:
- HibernateAccessStrategyAdapter
- HibernateAccessStrategyFactory
- HibernateCacheProxy
- HibernateExceptionConverter
- HibernateKeyTransformer
- HibernateNonStrictAccessStrategy
- HibernateReadOnlyAccessStrategy
- HibernateReadWriteAccessStrategy
- HibernateTransactionalAccessStrategy

Alternative solution: Hibernate 5.3 is not yet released so we could move 
implementation for the newest version to its own subpackage. Formally it would 
not be a breaking change.
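
The errors above are instances of the split-package restriction: a module may not read the same package from two different modules. A minimal sketch of detecting such conflicts up front, assuming a hypothetical map from module name to the packages it contains (SplitPackageCheck and findSplitPackages are illustrative names, not part of Ignite or the JDK):

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical helper: reports packages that are contained in more than one module. */
final class SplitPackageCheck {
    private SplitPackageCheck() {
    }

    /**
     * @param modulePackages Module name mapped to the packages that module contains.
     * @return Package name mapped to the modules containing it, only for packages
     *      that appear in two or more modules.
     */
    static Map<String, Set<String>> findSplitPackages(Map<String, Set<String>> modulePackages) {
        Map<String, Set<String>> owners = new TreeMap<>();

        // Invert the mapping: for every package collect the modules that contain it.
        for (Map.Entry<String, Set<String>> e : modulePackages.entrySet()) {
            for (String pkg : e.getValue())
                owners.computeIfAbsent(pkg, k -> new TreeSet<>()).add(e.getKey());
        }

        // Keep only the packages owned by more than one module.
        owners.values().removeIf(mods -> mods.size() < 2);

        return owners;
    }
}
```

For the Hibernate 5.3 case, passing ignite.hibernate.core and ignite.hibernate.5.3, both mapped to org.apache.ignite.cache.hibernate, would report that package with both modules as owners.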





Re: [DISCUSSION] Channel communication between nodes

2019-03-05 Thread Павлухин Иван
Maxim,

My humble opinion. If there is no convenient means to implement
partition file sending today then we should introduce something. And
keeping such facility private is much easier, because introduction of
new public API is a significantly more complex task.

Fri, Mar 1, 2019 at 19:44, Maxim Muzafarov :
>
> Igniters,
>
> Apache Ignite has a very suitable messaging user interface [1] for
> topic-based communication between nodes (or a specific group of nodes
> within a cluster). The messaging functionality in Ignite is provided
> via IgniteMessaging interface. It allows:
> - send a message to a certain topic
> - register local\remote listeners
>
> I really like this feature, but its disadvantage is that when a user
> wants to transfer a large amount of binary data (e.g. files) between
> nodes, they must write complex logic to wrap it into messages. I think
> Ignite could have an interface, e.g. IgniteChannels, which will allow:
> - register local\remote listeners for channel created\destroy events.
> - create a channel connection (a wrapped socket channel) to a certain
> node\group of nodes and the desired topic
>
> Another suitable case for such a feature is internal usage for
> Apache Ignite's own needs. I can mention here the task of
> cluster rebalancing by sending cache partition files between nodes.
> I've posted a small description of it on the IEP-28 page [2].
>
>
> WDYT about it?
>
> ---
>
> API (assumed)
>
> IgniteChannels chnls = ignite0.channels();
> chnls.remoteListen(TOPIC.MY_TOPIC, new RemoteListener());
>
> IgniteSocketChannel ch0 = chnls.channel(node, TOPIC.MY_TOPIC);
> ch0.writeInt(bigFile.size());
> ch0.transferTo(FileChannel.open(bigFile.path(), StandardOpenOption.READ));
>
>
> /** Remote listener that stores the received bytes into a file. */
>
> private class RemoteListener
>     implements IgniteBiPredicate<UUID, IgniteSocketChannel> {
>
>     @IgniteInstanceResource
>     private Ignite ignite;
>
>     @Override public boolean apply(
>         UUID nodeId,
>         IgniteSocketChannel ch
>     ) {
>         int size = ch.readInt();
>         ignite.fileSystem("base")
>             .create("bigfile.mpg")
>             .transferFrom(ch, size);
>         return true;
>     }
> }
>
>
> [1] https://apacheignite.readme.io/docs/messaging
> [2] 
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-28%3A+Cluster+peer-2-peer+balancing#IEP-28:Clusterpeer-2-peerbalancing-CommunicationSpi



-- 
Best regards,
Ivan Pavlukhin


Batch updates in Ignite B+ tree.

2019-03-05 Thread Pavel Pereslegin
Hi Igniters!

I am working on implementing batch updates in PageMemory [1] to
improve the performance of preloader, datastreamer and putAll.

This task consists of two major related improvements:
1. Batch writing to PageMemory via FreeList - store several values at
once to single memory page.
2. Batch updates in BPlusTree (for introducing invokeAll operation).
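
To illustrate item 1, here is a minimal sketch of planning which values of a batch go to the same page, so that each page is touched once per batch. It assumes every value fits into one page and that values are placed in arrival order; PageBatchPacker and pack are hypothetical names and this is not the actual FreeList logic:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: assigns a batch of values to pages in arrival order. */
final class PageBatchPacker {
    private PageBatchPacker() {
    }

    /**
     * @param valueSizes Sizes of the values in bytes, in arrival order.
     * @param pageCapacity Free space of an empty page in bytes (assumed equal for all pages).
     * @return One list per page, holding the indexes of the values stored in that page.
     */
    static List<List<Integer>> pack(int[] valueSizes, int pageCapacity) {
        List<List<Integer>> pages = new ArrayList<>();
        List<Integer> cur = new ArrayList<>();
        int free = pageCapacity;

        for (int i = 0; i < valueSizes.length; i++) {
            int size = valueSizes[i];

            // Assumption of this sketch: no value is larger than a page.
            if (size <= 0 || size > pageCapacity)
                throw new IllegalArgumentException("Unsupported value size: " + size);

            // The current page is full enough: close it and start a new one.
            if (size > free) {
                pages.add(cur);
                cur = new ArrayList<>();
                free = pageCapacity;
            }

            cur.add(i);
            free -= size;
        }

        if (!cur.isEmpty())
            pages.add(cur);

        return pages;
    }
}
```

Each inner list can then be written with a single page lock and a single write, instead of one lock and one write per value.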

I started to investigate the issue with batch updates in B+ tree, and
it seems that the concurrent top-down balancing algorithm (TD)
described in this paper [2] may be suitable for batch insertion of
keys into Ignite B+ Tree.
This algorithm uses a top-down balancing approach and allows inserting
a batch of keys that belong to leaves sharing the same parent. The
drawback of the top-down balancing approach is that the parent node
is locked while insertion/splitting is performed in child nodes.
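
The grouping step can be sketched as follows: given the sorted separator keys of one parent node and a sorted batch, split the batch into one group per child. This is only an illustration under assumed conventions (child i holds keys up to and including separator i, the last child holds the rest); BatchKeyGrouping and groupByChild are hypothetical names, not BPlusTree code:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits a sorted batch of keys between the children of one parent node. */
final class BatchKeyGrouping {
    private BatchKeyGrouping() {
    }

    /**
     * @param separators Sorted separator keys of the parent. Child {@code i} is assumed to hold
     *      keys {@code k <= separators[i]}, the last child holds all greater keys.
     * @param batch Sorted keys to insert.
     * @return List of {@code separators.length + 1} groups, group {@code i} goes to child {@code i}.
     */
    static List<List<Long>> groupByChild(long[] separators, long[] batch) {
        List<List<Long>> groups = new ArrayList<>();

        for (int i = 0; i <= separators.length; i++)
            groups.add(new ArrayList<>());

        int child = 0;

        // Both inputs are sorted, so a single forward pass is enough.
        for (long key : batch) {
            while (child < separators.length && key > separators[child])
                child++;

            groups.get(child).add(key);
        }

        return groups;
    }
}
```

With separators {10, 20}, the batch {1, 10, 11, 25, 30} is split into {1, 10}, {11} and {25, 30}; each group can then be applied to its child while the parent stays locked.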

WDYT? Do you know other approaches for implementing batch updates in
Ignite B+ Tree?

[1] https://issues.apache.org/jira/browse/IGNITE-7935
[2] https://aaltodoc.aalto.fi/bitstream/handle/123456789/2168/isbn9512258951.pdf


[jira] [Created] (IGNITE-11484) Get rid of ForkJoinPool#commonPool usage for system critical tasks

2019-03-05 Thread Ivan Rakov (JIRA)
Ivan Rakov created IGNITE-11484:
---

 Summary: Get rid of ForkJoinPool#commonPool usage for system 
critical tasks
 Key: IGNITE-11484
 URL: https://issues.apache.org/jira/browse/IGNITE-11484
 Project: Ignite
  Issue Type: Bug
Reporter: Ivan Rakov
Assignee: Ivan Rakov
 Fix For: 2.8


We use ForkJoinPool#commonPool for sorting checkpoint pages.
This may backfire if the common pool is already utilized in the current JVM: a 
checkpoint may wait a long time for sorting, which in turn will cause a drop in 
user load.
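
One possible direction, sketched under assumptions: run the sort on a dedicated ForkJoinPool so that it does not compete with other users of the common pool. DedicatedPoolSorter, SortTask and the threshold value are hypothetical and are not Ignite code; the sketch creates a pool per call only to stay self-contained, a real implementation would more likely create the pool once and reuse it.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

/** Hypothetical helper: parallel merge sort that runs on its own pool, not on the common pool. */
final class DedicatedPoolSorter {
    /** Chunks of this size or smaller are sorted sequentially (illustrative value). */
    private static final int SEQ_THRESHOLD = 1 << 12;

    private DedicatedPoolSorter() {
    }

    /** Sorts the array in place using a dedicated pool with the given parallelism. */
    static void sort(long[] arr, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);

        try {
            // Blocks until the task is done; subtasks started inside the task run in 'pool'.
            pool.invoke(new SortTask(arr, 0, arr.length));
        }
        finally {
            pool.shutdown();
        }
    }

    /** Sorts the range [from, to) of the array. */
    private static final class SortTask extends RecursiveAction {
        private final long[] arr;
        private final int from;
        private final int to;

        SortTask(long[] arr, int from, int to) {
            this.arr = arr;
            this.from = from;
            this.to = to;
        }

        @Override protected void compute() {
            if (to - from <= SEQ_THRESHOLD) {
                Arrays.sort(arr, from, to);

                return;
            }

            int mid = (from + to) >>> 1;

            // Sort both halves in parallel, then merge them.
            invokeAll(new SortTask(arr, from, mid), new SortTask(arr, mid, to));

            merge(mid);
        }

        /** Merges the sorted ranges [from, mid) and [mid, to) in place using a copy of the left half. */
        private void merge(int mid) {
            long[] left = Arrays.copyOfRange(arr, from, mid);
            int i = 0;
            int j = mid;
            int k = from;

            while (i < left.length && j < to)
                arr[k++] = left[i] <= arr[j] ? left[i++] : arr[j++];

            // Remaining right-half elements are already in their final positions.
            while (i < left.length)
                arr[k++] = left[i++];
        }
    }
}
```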





[DISCUSSION] System cache persistence.

2019-03-05 Thread Andrey Mashenkov
Hi Igniters,

I'd like to start a discussion to avoid system cache usage with persistence.

System cache is used in a number of component internals. No one cared that
the system cache can hold stale data after a grid restart, as that wasn't
possible before 2.1.
Since Ignite 2.1 the system cache can be persistent, which may affect the
behavior of these components:
Compute, IGFS, ServiceGrid, DataStructures.

What's wrong?
1. System cache is persistent only if the default region is configured as
persistent (and vice versa).
This is non-obvious and can cause unpredictable issues.

2. Any change in the system cache requires a distributed transaction, which may
cause a deadlock.
We already avoid its usage in BinaryMarshaller and (almost) in recently
reworked ServiceGrid due to the "deadlock" reason.


What is affected, and what can we do?
3. IGFS
AFAIK, IGFS support is going to be discontinued. There is nothing to do if
IGFS will be removed in 3.0.

4. DataStructures
It looks broken (maybe partially), as I see CacheDataStructuresManager
uses on-heap maps for id->structure mapping for some structures.
It looks safe to deprecate persistence for data structures for now
and rework them separately.
Also, from the user's perspective, I'd expect data structure persistence to be
configured in some separate place or in the data structure configuration.

5. Compute
Let's rework this to use Metastore.

6. ServiceGrid.
We can use Metastore and drop old-services later.

7. Some third-party plugins may be affected.
Of course, there is no compatibility guarantee if someone uses internal
components, but issue #1 can frustrate users.
We can prevent the system cache from being persistent.

Do we really ever need System cache with persistence enabled?
Thoughts?

I've created a ticket for this [1].

[1] https://issues.apache.org/jira/browse/IGNITE-11483

-- 
Best regards,
Andrey V. Mashenkov


[jira] [Created] (IGNITE-11483) Make system cache non-persistent and deprecate.

2019-03-05 Thread Andrew Mashenkov (JIRA)
Andrew Mashenkov created IGNITE-11483:
-

 Summary: Make system cache non-persistent and deprecate.
 Key: IGNITE-11483
 URL: https://issues.apache.org/jira/browse/IGNITE-11483
 Project: Ignite
  Issue Type: Bug
  Components: cache, compute, igfs, managed services
Reporter: Andrew Mashenkov


For now, a persistent Default Region makes the System cache persistent as well (the 
same holds for a non-persistent region). This behavior is non-obvious and it may 
cause unpredictable issues.

We have a number of components that use the system cache; some of them don't need 
the system cache to be persistent, while others are ok with it:
 * DataStructures - data structure persistence should be configured in its 
configuration. Moreover, some structures look broken, as 
CacheDataStructuresManager uses in-memory maps.
 * Compute - most likely persistence not needed.
 * Services - metastore can be used instead.
 * Igfs - candidate to remove in 3.0

 





[jira] [Created] (IGNITE-11482) MVCC: Error on TxLog initialization.

2019-03-05 Thread Roman Kondakov (JIRA)
Roman Kondakov created IGNITE-11482:
---

 Summary: MVCC: Error on TxLog initialization.
 Key: IGNITE-11482
 URL: https://issues.apache.org/jira/browse/IGNITE-11482
 Project: Ignite
  Issue Type: Bug
  Components: mvcc
Reporter: Roman Kondakov
 Fix For: 2.8


Some [tests remained 
flaky|https://ci.ignite.apache.org/project.html?projectId=IgniteTests24Java8==testDetails=-935846982857542309=TEST_STATUS_DESC=50_IgniteTests24Java8=__all_branches__]
 even after IGNITE-10582 has been fixed. It should be investigated again.
{noformat}
[21:44:14] (err) Failed to execute compound future reducer: GridCompoundFuture 
[rdc=null, initFlag=1, lsnrCalls=0, done=false, cancelled=false, err=null, 
futs=TransformCollectionView [true, false, false, false]]class 
org.apache.ignite.IgniteCheckedException: Failed to complete exchange process.
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.createExchangeException(GridDhtPartitionsExchangeFuture.java:3209)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendExchangeFailureMessage(GridDhtPartitionsExchangeFuture.java:3237)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3323)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3304)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1519)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:852)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2920)
at 
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2769)
at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to 
initialize exchange locally [locNodeId=140a9253-f646-4691-9947-2b211a90]
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onCacheChangeRequest(GridDhtPartitionsExchangeFuture.java:1254)
at 
org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:782)
... 4 more
Caused by: java.lang.IllegalStateException: Failed to get page IO 
instance (page content is corrupted)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forVersion(IOVersions.java:85)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.io.IOVersions.forPage(IOVersions.java:97)
at 
org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.init(PagesList.java:181)
at 
org.apache.ignite.internal.processors.cache.persistence.tree.reuse.ReuseListImpl.<init>(ReuseListImpl.java:57)
at 
org.apache.ignite.internal.processors.cache.mvcc.txlog.TxLog.init(TxLog.java:161)
at 
org.apache.ignite.internal.processors.cache.mvcc.txlog.TxLog.<init>(TxLog.java:87)
at 
org.apache.ignite.internal.processors.cache.mvcc.MvccProcessorImpl.ensureStarted(MvccProcessorImpl.java:302)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.createCacheContext(GridCacheProcessor.java:1552)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheContext(GridCacheProcessor.java:2325)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$null$6a5b31b9$1(GridCacheProcessor.java:2164)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCachesIfPossible$6(GridCacheProcessor.java:2104)
at 
org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$926b6886$1(GridCacheProcessor.java:2161)
at 
org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:10833)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 

[jira] [Created] (IGNITE-11481) [ML] Prototype of DatasetRow for Vectorizer

2019-03-05 Thread Alexey Platonov (JIRA)
Alexey Platonov created IGNITE-11481:


 Summary: [ML] Prototype of DatasetRow for Vectorizer
 Key: IGNITE-11481
 URL: https://issues.apache.org/jira/browse/IGNITE-11481
 Project: Ignite
  Issue Type: Improvement
  Components: ml
Reporter: Alexey Platonov
Assignee: Alexey Platonov


Vectorizer should produce a DatasetRow object that can contain columns of 
different types (double, string, etc.). This is needed for preprocessors working 
with non-double values.





[jira] [Created] (IGNITE-11480) [ML] Use only Vectorizer API in DatasetTrainer API

2019-03-05 Thread Alexey Platonov (JIRA)
Alexey Platonov created IGNITE-11480:


 Summary: [ML] Use only Vectorizer API in DatasetTrainer API  
 Key: IGNITE-11480
 URL: https://issues.apache.org/jira/browse/IGNITE-11480
 Project: Ignite
  Issue Type: Improvement
  Components: ml
Reporter: Alexey Platonov
Assignee: Alexey Platonov


Use only Vectorizer API in DatasetTrainer API to avoid problems with user 
classes serialization.





[jira] [Created] (IGNITE-11479) [ML] Use new vectorizer API in PartitionDatasetBuilders

2019-03-05 Thread Alexey Platonov (JIRA)
Alexey Platonov created IGNITE-11479:


 Summary: [ML] Use new vectorizer API in PartitionDatasetBuilders
 Key: IGNITE-11479
 URL: https://issues.apache.org/jira/browse/IGNITE-11479
 Project: Ignite
  Issue Type: Improvement
  Components: ml
Reporter: Alexey Platonov
Assignee: Alexey Platonov


We need to exclude current feature extractors from partition building API and 
replace old extractors with new vectorizer.





Re: Tests for ML using binary builds

2019-03-05 Thread Alexey Platonov
Ivan,
Thanks for your answer. I want to use binary builds explicitly because they
don't share jars of client code. If we can use multi JVM test with
different classpaths I will use them - such approach is more convenient
from TC point of view.

P.S. I use Docker in my prototype just because it is easy for me and for
test cluster management - I can create docker-image with all configs and
scripts and run Ignite cluster in a separate network.

On Tue, Mar 5, 2019 at 12:28 PM Павлухин Иван  wrote:

> Alexey,
>
> If problems arise in environments different from one where usual
> Ignite tests run then definitely it is a good idea to cover it. And
> testing other build kinds and in other environments is a good idea as
> well. But a particular problem with serialization and peer class
> loading is not clear for me. Why binary builds and Docker are needed
> there? Why multi JVM tests from Ignite testing framework cannot reveal
> mentioned problems?
>
> Ideally I think we should aggregate all failure reporting in common
> place. And for me TC bot is the best choice. Consequently it should be
> TeamCity most likely.
>
> But all in all I think we can give it a try according to your proposal
> and see how things go.
>
> Tue, Mar 5, 2019 at 11:09, dmitrievanthony :
> >
> > Hi Alexey,
> >
> > I think it's a great idea. Travis + Docker is a very good and cheap
> > solution, so we could start with it. Regarding the statistics, Travis
> > allows checking the last build status using a badge, so it also
> > shouldn't be a problem.
> >
> > Best regards,
> > Anton Dmitriev.
> >
> >
> >
> > --
> > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>
>
>
> --
> Best regards,
> Ivan Pavlukhin
>


[jira] [Created] (IGNITE-11478) [ML] Use new vectorizer API in Trainers

2019-03-05 Thread Alexey Platonov (JIRA)
Alexey Platonov created IGNITE-11478:


 Summary: [ML] Use new vectorizer API in Trainers
 Key: IGNITE-11478
 URL: https://issues.apache.org/jira/browse/IGNITE-11478
 Project: Ignite
  Issue Type: Improvement
  Components: ml
Reporter: Alexey Platonov
Assignee: Alexey Platonov


We should rewrite current trainers - exclude all "free"-feature/labels 
extractors from APIs and use new vectorizer in them.





[jira] [Created] (IGNITE-11476) [ML] Use new feature extraction API in examples

2019-03-05 Thread Alexey Platonov (JIRA)
Alexey Platonov created IGNITE-11476:


 Summary: [ML] Use new feature extraction API in examples
 Key: IGNITE-11476
 URL: https://issues.apache.org/jira/browse/IGNITE-11476
 Project: Ignite
  Issue Type: Improvement
  Components: ml
Reporter: Alexey Platonov
Assignee: Alexey Platonov


Introduce new feature/label extraction API to all examples. These examples 
should work on binary builds without sharing additional jars to libs directory 
(except ml-jar).





[jira] [Created] (IGNITE-11477) [ML] Create tests for ML algorithms stability check against binary builds

2019-03-05 Thread Alexey Platonov (JIRA)
Alexey Platonov created IGNITE-11477:


 Summary: [ML] Create tests for ML algorithms stability check 
against binary builds 
 Key: IGNITE-11477
 URL: https://issues.apache.org/jira/browse/IGNITE-11477
 Project: Ignite
  Issue Type: Improvement
  Components: ml
Reporter: Alexey Platonov
Assignee: Alexey Platonov


After new feature API creation we should create tests for ML algorithms 
stability check against binary builds (or on other JVMs without common 
classpath). All new algorithms should be delivered with such test.





[jira] [Created] (IGNITE-11475) [ML] Vectorizer API prototype with POC

2019-03-05 Thread Alexey Platonov (JIRA)
Alexey Platonov created IGNITE-11475:


 Summary: [ML] Vectorizer API prototype with POC 
 Key: IGNITE-11475
 URL: https://issues.apache.org/jira/browse/IGNITE-11475
 Project: Ignite
  Issue Type: Improvement
  Components: ml
Reporter: Alexey Platonov
Assignee: Alexey Platonov


We need to create a prototype of API for features/labels extraction and 
introduce it to one or two already existing examples. This prototype should 
show that new API works on binary builds.





[jira] [Created] (IGNITE-11474) Add possibility to run idle_verify in not idle cluster

2019-03-05 Thread Vladislav Pyatkov (JIRA)
Vladislav Pyatkov created IGNITE-11474:
--

 Summary: Add possibility to run idle_verify in not idle cluster
 Key: IGNITE-11474
 URL: https://issues.apache.org/jira/browse/IGNITE-11474
 Project: Ignite
  Issue Type: Improvement
Reporter: Vladislav Pyatkov


We are able to introduce a kind of READ_ONLY mode that blocks all data load.
Using this mode, we should add a specific parameter to idle_verify that 
blocks data load and continues the task after the cluster has switched to READ_ONLY.





Re: Storing short/empty strings in Ignite

2019-03-05 Thread Ilya Kasnacheev
Hello!

If you can modify your code to store nulls instead of empty strings, nulls
seem to be much more compact.

Regards,
-- 
Ilya Kasnacheev


Tue, Mar 5, 2019 at 10:12, Valentin Kulichenko <
valentin.kuliche...@gmail.com>:

> Hey folks,
>
> While working with Ignite users, I keep seeing data models where a single
> object (row) might contain many fields (100, 200, more...), and most of
> them are strings.
>
> Correct me if I'm wrong, but per my understanding, for every such field we
> store an integer value to represent its length. This is significant
> overhead - with 200 fields we spend 800 bytes only for this.
>
> Now here is the catch: the vast majority of those strings are actually empty or
> very short (several chars), therefore we don't really need 4 bytes for their
> length.
>
> My suggestion is to introduce another data type, e.g. STRING_SHORT, use it
> for all strings that are 255 chars or less, and therefore use a single byte
> to encode length. We can go even further, and also introduce STRING_EMPTY,
> which obviously doesn't need any length information at all.
>
> What do you guys think?
>
> -Val
>


[jira] [Created] (IGNITE-11473) SQL: check convert to ENUM type by functions CAST, CONVERT throws sane exception

2019-03-05 Thread Taras Ledkov (JIRA)
Taras Ledkov created IGNITE-11473:
-

 Summary: SQL: check convert to ENUM type by functions CAST, 
CONVERT throws sane exception
 Key: IGNITE-11473
 URL: https://issues.apache.org/jira/browse/IGNITE-11473
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Affects Versions: 2.7
Reporter: Taras Ledkov


The CAST and CONVERT functions have a bug in H2.
It is fixed in H2 1.4.198.
We have to check that the functions throw a sane exception after H2 is upgraded.





[jira] [Created] (IGNITE-11472) SQL: throw sane exception for unsupported features

2019-03-05 Thread Taras Ledkov (JIRA)
Taras Ledkov created IGNITE-11472:
-

 Summary: SQL: throw sane exception for unsupported features
 Key: IGNITE-11472
 URL: https://issues.apache.org/jira/browse/IGNITE-11472
 Project: Ignite
  Issue Type: Improvement
  Components: sql
Affects Versions: 2.7
Reporter: Taras Ledkov



|| Feature || Issue || Comments ||
| WITH RECURSIVE | IGNITE-7664 | can be fixed immediately |
| DEFAULT value in the INSERT / MERGE | IGNITE-7664 | can be fixed immediately |
| MEMORY, TEMPORARY, HIDDEN table types for CREATE TABLE | IGNITE-7664 | can be fixed immediately |
| FIRST column position for ALTER TABLE ADD COLUMN | IGNITE-7664 | can be fixed immediately |
| HELP / SHOW commands | IGNITE-7664 | can be fixed immediately |
| GRANT / REVOKE commands | IGNITE-7664 | can be fixed immediately |
| TIMESTAMP WITH TIME ZONE unsupported type | IGNITE-7664 | can be fixed immediately |
| ENUM unsupported type | IGNITE-7664 | partially fixed, CAST and CONVERT function has the bug at the H2 fixed at 1.4.198 |
| MERGE USING | IGNITE-11444 | cannot be fixed without patch to H2 |






Re: Tests for ML using binary builds

2019-03-05 Thread Павлухин Иван
Alexey,

If problems arise in environments different from one where usual
Ignite tests run then definitely it is a good idea to cover it. And
testing other build kinds and in other environments is a good idea as
well. But the particular problem with serialization and peer class
loading is not clear to me. Why are binary builds and Docker needed
there? Why can't multi-JVM tests from the Ignite testing framework reveal
the mentioned problems?

Ideally I think we should aggregate all failure reporting in common
place. And for me TC bot is the best choice. Consequently it should be
TeamCity most likely.

But all in all I think we can give it a try according to your proposal
and see how things go.

Tue, Mar 5, 2019 at 11:09, dmitrievanthony :
>
> Hi Alexey,
>
> I think it's a great idea. Travis + Docker is a very good and cheap
> solution, so we could start with it. Regarding the statistics, Travis allows
> checking the last build status using a badge, so it also shouldn't be a
> problem.
>
> Best regards,
> Anton Dmitriev.
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/



-- 
Best regards,
Ivan Pavlukhin


Re: Tests for ML using binary builds

2019-03-05 Thread dmitrievanthony
Hi Alexey,

I think it's a great idea. Travis + Docker is a very good and cheap
solution, so we could start with it. Regarding the statistics, Travis allows
checking the last build status using a badge, so it also shouldn't be a
problem.

Best regards,
Anton Dmitriev.



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/


Re: Storing short/empty strings in Ignite

2019-03-05 Thread Vladimir Ozerov
Hi Val,

I would say that we do not need the string length at all, because it can be
derived from the object footer (next field offset MINUS current field offset).
It is not a very good idea to implement the proposed change in Apache Ignite 2.x
because it is breaking and will add unnecessary complexity to the already very
complex binary infrastructure. Instead, it is better to review the binary
format in 3.0 and remove lengths not only from Strings, but from other
variable-length data types as well (arrays, decimals).
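
To make the "next offset MINUS current offset" idea concrete, here is a minimal sketch. It assumes fields are stored contiguously in the same order as their footer offsets and ignores per-field type headers, so it is not the actual Ignite binary layout; FooterOffsets and fieldLengths are hypothetical names:

```java
/** Hypothetical helper: derives field lengths from footer offsets instead of stored length prefixes. */
final class FooterOffsets {
    private FooterOffsets() {
    }

    /**
     * @param fieldOffsets Ascending start offsets of the fields, as they would be read from a footer.
     * @param dataEnd Offset just past the data of the last field.
     * @return Length in bytes of every field.
     */
    static int[] fieldLengths(int[] fieldOffsets, int dataEnd) {
        int[] lens = new int[fieldOffsets.length];

        for (int i = 0; i < fieldOffsets.length; i++) {
            // The last field ends where the data region ends, any other field ends where the next one starts.
            int next = i + 1 < fieldOffsets.length ? fieldOffsets[i + 1] : dataEnd;

            lens[i] = next - fieldOffsets[i];
        }

        return lens;
    }
}
```

With offsets {0, 5, 5, 12} and a data end of 20, the lengths are {5, 0, 7, 8}: the empty field costs no length bytes at all, compared with 4 bytes per field (800 bytes for 200 fields) when every length is stored explicitly.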

On Tue, Mar 5, 2019 at 10:12 AM Valentin Kulichenko <
valentin.kuliche...@gmail.com> wrote:

> Hey folks,
>
> While working with Ignite users, I keep seeing data models where a single
> object (row) might contain many fields (100, 200, more...), and most of
> them are strings.
>
> Correct me if I'm wrong, but per my understanding, for every such field we
> store an integer value to represent its length. This is significant
> overhead - with 200 fields we spend 800 bytes only for this.
>
> Now here is the catch: the vast majority of those strings are actually empty or
> very short (several chars), therefore we don't really need 4 bytes for their
> length.
>
> My suggestion is to introduce another data type, e.g. STRING_SHORT, use it
> for all strings that are 255 chars or less, and therefore use a single byte
> to encode length. We can go even further, and also introduce STRING_EMPTY,
> which obviously doesn't need any length information at all.
>
> What do you guys think?
>
> -Val
>