Re: HDP, Hive + Ignite

Ivan V. Mon, 24 Apr 2017 11:09:58 -0700

p.s. Please use a HadoopFileSystemFactory in the secondary file system
config, as described here:
https://apacheignite-fs.readme.io/docs/installing-on-hortonworks-hdp
The constructor
org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem(2)
is deprecated.


On Mon, Apr 24, 2017 at 7:44 PM, Ivan V. <iveselovs...@gridgain.com> wrote:

> Hi, Aloha,
> First of all, the Hadoop Accelerator consists of 2 independent parts,
> each of which can be used without the other: (1) IGFS and (2) the
> map-reduce execution engine.
>
> IGFS is not used in your case because the default file system in your
> cluster is still hdfs://.... (specified by the global property
> "fs.default.name"). The 2 properties you set (*.igfs.impl=..) define the
> IGFS implementation classes, but they come into play only when a URI with
> the igfs:// scheme is encountered. Setting
> fs.default.name=igfs://myhost:10500/ globally is not as good an idea as it
> may appear, because the HDFS daemons (namenode, datanode) cannot run with
> such a property value, and you probably need HDFS as the underlying
> (secondary) file system.
>
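To illustrate the scheme-based lookup described above, here is a minimal
sketch in plain shell. It only mimics the idea that the URI scheme selects
the "fs.<scheme>.impl" property and that a path without a scheme falls back
to the default file system. The function names and the DEFAULT_FS value are
hypothetical, and real Hadoop resolution has more steps than this.

```shell
# Hypothetical stand-in for the cluster-wide default file system
# (the value of fs.default.name); adjust for your cluster.
DEFAULT_FS="${DEFAULT_FS:-hdfs://namenode:8020/}"

# Print the scheme of a URI, or the scheme of DEFAULT_FS when the
# argument is a bare path with no scheme.
scheme_of() {
  case "$1" in
    *://*) printf '%s\n' "${1%%://*}" ;;
    *)     printf '%s\n' "${DEFAULT_FS%%://*}" ;;
  esac
}

# Print the name of the property that names the implementation class
# for the given path.
impl_key() {
  printf 'fs.%s.impl\n' "$(scheme_of "$1")"
}

impl_key "igfs:///user/hive"   # prints fs.igfs.impl
impl_key "/user/hive"          # prints fs.hdfs.impl with the default above
```

With hdfs:// as the default, a bare path such as /user/hive never reaches
the igfs implementation classes, which matches the behaviour reported in
the original question.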
> So, to use IGFS you should either use an explicit URI with the igfs://
> scheme, as you do in your example above ("hadoop fs -ls
> igfs:///user/hive"), or try to instruct Hive to use IGFS as the default
> file system, like this:
> hive-1.2/bin/beeline \
> --hiveconf fs.default.name=igfs://myhost:10500/ \
> --hiveconf hive.rpc.query.plan=true \
> --hiveconf mapreduce.framework.name=ignite \
> --hiveconf mapreduce.jobtracker.address=myhost:11211 \
> -u jdbc:hive2://127.0.0.1:10000
>
> Also, in order to use the Ignite map-reduce engine with Hive in HDP 2.4+,
> the Hive execution engine (property "hive.execution.engine") should be
> explicitly set to "mr", because the default value is different.
>
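Combining the two suggestions above (IGFS as the default file system for
the Hive session, plus the "mr" execution engine), a dry-run sketch could
look like the following. The hosts and ports are the placeholders used in
this thread. The variable names and the RUN_BEELINE switch are made up for
this illustration and are not part of Hive or Ignite.

```shell
# Placeholders from the thread; override via the environment as needed.
IGFS_URI="${IGFS_URI:-igfs://myhost:10500/}"
JOBTRACKER="${JOBTRACKER:-myhost:11211}"
HS2_URL="${HS2_URL:-jdbc:hive2://127.0.0.1:10000}"

# Collect the beeline arguments as positional parameters.
set -- \
  --hiveconf "fs.default.name=${IGFS_URI}" \
  --hiveconf "hive.rpc.query.plan=true" \
  --hiveconf "hive.execution.engine=mr" \
  --hiveconf "mapreduce.framework.name=ignite" \
  --hiveconf "mapreduce.jobtracker.address=${JOBTRACKER}" \
  -u "${HS2_URL}"

CMD_LINE="beeline $*"

# Dry run by default: print the command. Set RUN_BEELINE=1 to execute it.
if [ "${RUN_BEELINE:-0}" = "1" ]; then
  beeline "$@"
else
  echo "$CMD_LINE"
fi
```

Passing the settings with --hiveconf keeps them scoped to one Hive session,
so the cluster-wide fs.default.name can stay on hdfs:// for the HDFS
daemons.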
> On Mon, Apr 24, 2017 at 3:09 PM, <al...@74.ru> wrote:
>
>> Hi,
>>
>> I have an HDP 2.6 cluster (highly available, 8 nodes) and would like to
>> try using Hive+ORC+Tez with Ignite. I guess I should use IGFS as a cache
>> layer for HDFS. I installed Hadoop Accelerator 1.9 on all cluster nodes
>> and run one Ignite node on every cluster node.
>>
>> I added these settings using Ambari and then restarted HDFS, MapReduce,
>> Yarn, Hive.
>> HDFS, 2 new properties added to Custom core-site
>> fs.igfs.impl=org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem
>> fs.AbstractFileSystem.igfs.impl=org.apache.ignite.hadoop.fs.v2.IgniteHadoopFileSystem
>>
>> Mapred, Custom mapred-site
>> mapreduce.framework.name=ignite
>> mapreduce.jobtracker.address=dev-nn1:11211
>>
>> Hive, Custom hive-site
>> hive.rpc.query.plan=true
>>
>> Now I can get access to HDFS through IGFS:
>> hadoop fs -ls igfs:///user/hive
>> Found 3 items
>> drwx------  - hive hdfs          0 2017-04-19 21:00 igfs:///user/hive/.Trash
>> drwxr-xr-x  - hive hdfs          0 2017-04-19 10:07 igfs:///user/hive/.hiveJars
>> drwx------  - hive hdfs          0 2017-04-22 14:27 igfs:///user/hive/.staging
>>
>> I thought that Hive would read data from HDFS the first time and then
>> read the same data from IGFS.
>> But when I run Hive (CLI or beeline) it still reads data from HDFS (I
>> tried a few times); in igniteVisor, "Avg. free heap" remains the same
>> before/during/after running a query (about 80%).
>> What is wrong? Maybe I should load data into IGFS manually for every
>> query?
>
>
>
