Re: Ignite ML withKeepBinary cache

2019-01-21 Thread otorreno
Thanks Ilya,

I got the link to the JIRA ticket in the Ignite Devs mailing list. In fact I
already included a comment in the ticket.

I got a response from Alexey Zinoviev in the Dev list too, and I am now
waiting to receive a further update from him on this matter.

Regards,



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Ignite ML withKeepBinary cache

2019-01-01 Thread otorreno
Hi everyone,

After the new release (2.7.0), I have been playing around with the machine
learning algorithms a bit.
We have some data in a cache created with the "withKeepBinary()" option, and
I wanted to test whether the machine learning algorithms would work with such
a cache. I tried, but it fails with the following stack trace:

org.apache.ignite.IgniteException: testType
    at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1858)
    at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:568)
    at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6816)
    at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:562)
    at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:491)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ignite.binary.BinaryInvalidTypeException: testType
    at org.apache.ignite.internal.binary.BinaryContext.descriptorForTypeId(BinaryContext.java:707)
    at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize0(BinaryReaderExImpl.java:1757)
    at org.apache.ignite.internal.binary.BinaryReaderExImpl.deserialize(BinaryReaderExImpl.java:1716)
    at org.apache.ignite.internal.binary.BinaryObjectImpl.deserializeValue(BinaryObjectImpl.java:798)
    at org.apache.ignite.internal.binary.BinaryObjectImpl.value(BinaryObjectImpl.java:143)
    at org.apache.ignite.internal.processors.cache.CacheObjectUtils.unwrapBinary(CacheObjectUtils.java:177)
    at org.apache.ignite.internal.processors.cache.CacheObjectUtils.unwrapBinaryIfNeeded(CacheObjectUtils.java:39)
    at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager$ScanQueryIterator.advance(GridCacheQueryManager.java:3063)
    at org.apache.ignite.internal.processors.cache.query.GridCacheQueryManager$ScanQueryIterator.onHasNext(GridCacheQueryManager.java:2965)
    at org.apache.ignite.internal.util.GridCloseableIteratorAdapter.hasNextX(GridCloseableIteratorAdapter.java:53)
    at org.apache.ignite.internal.util.lang.GridIteratorAdapter.hasNext(GridIteratorAdapter.java:45)
    at org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.computeCount(ComputeUtils.java:313)
    at org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.computeCount(ComputeUtils.java:300)
    at org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.lambda$initContext$9b68d858$1(ComputeUtils.java:222)
    at org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.lambda$affinityCallWithRetries$b46c4136$1(ComputeUtils.java:90)
    at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1855)
    ... 8 common frames omitted
Caused by: java.lang.ClassNotFoundException: testType
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:338)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.ignite.internal.util.IgniteUtils.forName(IgniteUtils.java:8771)
    at org.apache.ignite.internal.MarshallerContextImpl.getClass(MarshallerContextImpl.java:349)
    at org.apache.ignite.internal.binary.BinaryContext.descriptorForTypeId(BinaryContext.java:698)
    ... 23 common frames omitted

While debugging, I found the source of the error. At some point the ML code
takes only the name of the upstreamCache (where the data resides) and creates
a new IgniteCache object from that name before copying the data to a dataset
cache. However, it does not carry over the keepBinary property of the
original cache, so the values get deserialized and the class lookup fails.
I hardcoded "withKeepBinary()" at the following lines:
https://github.com/apache/ignite/blob/2.7.0/modules/ml/src/main/java/org/apache/ignite/ml/dataset/impl/cache/util/ComputeUtils.java#L162
https://github.com/apache/ignite/blob/2.7.0/modules/ml/src/main/java/org/apache/ignite/ml/dataset/impl/cache/util/ComputeUtils.java#L215
https://github.com/apache/ignite/blob/2.7.0/modules/ml/src/main/java/org/apache/ignite/ml/dataset/impl/cache/CacheBasedDatasetBuilder.java#L99

With those changes it works. I tried to retrieve the keep binary property
from the upstreamCache, but I was not able to find the right method to obtain
it. I saw that the property is stored in the operation context field (opCtx),
but that field is private and cannot be accessed from the lines I modified.

My example code is available at:
https://gist.github.com/otorreno/ca6c5347c1bbde2d4fedd02b51d02cbb

Any plans on making the machine lear

Re: S3AFileSystem as IGFS secondary file system

2018-07-02 Thread otorreno
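I have been able to do it using the following lines:
// Factory that builds the Hadoop FileSystem from the given configuration file.
BasicHadoopFileSystemFactory f = new BasicHadoopFileSystemFactory();
f.setConfigPaths("cfg.xml");

// Secondary file system that delegates to the Hadoop FileSystem from the factory.
IgniteHadoopIgfsSecondaryFileSystem sec = new IgniteHadoopIgfsSecondaryFileSystem();
sec.setFileSystemFactory(f);

// fileSystemCfg is the FileSystemConfiguration of the IGFS instance (created elsewhere).
fileSystemCfg.setSecondaryFileSystem(sec);
fileSystemCfg.setDefaultMode(IgfsMode.DUAL_ASYNC);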

The "cfg.xml" file contains the S3 access and secret keys, and the bucket
URI. However, I would like to set the configuration in code rather than in a
configuration file. As far as I can see, the BasicHadoopFileSystemFactory
class only lets you specify file paths. Is there any reason not to allow
passing a Hadoop Configuration instance?
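
One possible workaround, shown only as a sketch: build the properties in code,
write them to a temporary file in the Hadoop configuration XML layout, and hand
that file's path to setConfigPaths(). The class name HadoopConfigFileWriter,
the bucket name, and the key placeholders are hypothetical. The property names
(fs.defaultFS, fs.s3a.access.key, fs.s3a.secret.key) are the usual Hadoop S3A
keys, and the sketch assumes that setConfigPaths() accepts an absolute path,
which should be checked against the Ignite version in use.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Ignite or Hadoop.
final class HadoopConfigFileWriter {
    /** Escapes the characters that are special in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Renders the properties in the Hadoop configuration XML layout. */
    static String toXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<?xml version=\"1.0\"?>\n<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("  <property>\n")
              .append("    <name>").append(escape(e.getKey())).append("</name>\n")
              .append("    <value>").append(escape(e.getValue())).append("</value>\n")
              .append("  </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    /** Writes the properties to a temporary file and returns its absolute path. */
    static Path writeTempConfig(Map<String, String> props) throws IOException {
        Path file = Files.createTempFile("igfs-s3a-", ".xml");
        file.toFile().deleteOnExit(); // the file holds credentials, so do not keep it around
        Files.write(file, toXml(props).getBytes(StandardCharsets.UTF_8));
        return file.toAbsolutePath();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", "s3a://my-bucket");              // placeholder bucket URI
        props.put("fs.s3a.access.key", "ACCESS_KEY_PLACEHOLDER"); // placeholder credentials
        props.put("fs.s3a.secret.key", "SECRET_KEY_PLACEHOLDER");

        Path cfg = writeTempConfig(props);
        System.out.println(cfg);
        // Assumption: the factory from the lines above would then be configured with
        // f.setConfigPaths(cfg.toString());
    }
}
```

This keeps the credentials out of a file shipped with the application, but they
still end up on disk for the lifetime of the process, so it does not replace
the ability to pass a Configuration instance directly.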

Best,
Oscar





S3AFileSystem as IGFS secondary file system

2018-07-02 Thread otorreno
Hi, 

I am struggling to get the S3AFileSystem configured as an IGFS secondary 
file system. 

I am using IGFS as my default file system, and do not want to have an HDFS 
cluster up and running besides the IGFS one. 

I have been able to reproduce the steps described at
https://apacheignite-fs.readme.io/docs/secondary-file-system, but that is not
the behaviour I am looking for.

What I want is to take an instance of S3AFileSystem, which is an
implementation of the Hadoop FileSystem, and configure IGFS to use it as its
secondary file system.

Is it possible? 

Best, 
Oscar 




