Re: [ANN] Hivemall: Hive scalable machine learning library
Hi,

I added support for state-of-the-art classifiers (not yet supported in Mahout), as well as Hivemall's cute(!?) logo, in Hivemall 0.1-rc3. Newly supported classifiers include:

- Confidence Weighted (CW)
- Adaptive Regularization of Weight Vectors (AROW)
- Soft Confidence Weighted (SCW1, SCW2)

These classifiers are much smarter than the standard SGD-based or passive-aggressive classifiers. Please check them out for yourself.

Thanks,
Makoto

(2013/10/11 4:28), Clark Yang (杨卓荦) wrote:

It looks really cool. I think I will try it out.

Cheers,
Zhuoluo (Clark) Yang

2013/10/5 Makoto YUI yuin...@gmail.com:

Hi Edward,

Thank you for your interest. The Hivemall project does not plan to have a dedicated mailing list; I will answer questions and comments on Twitter or through GitHub issues (with a "question" label).

BTW, I just added a CTR (Click-Through-Rate) prediction example using the dataset provided by a commercial search engine provider for KDD Cup 2012 track 2:
https://github.com/myui/hivemall/wiki/KDDCup-2012-track-2-CTR-prediction-dataset

I guess many of you are working on ad CTR/CVR prediction. This example might be of some help in understanding how to do it entirely within Hive.

Thanks,
Makoto @myui

(2013/10/04 23:02), Edward Capriolo wrote:

Looks cool, I'm already starting to play with it.

On Friday, October 4, 2013, Makoto Yui yuin...@gmail.com wrote:

Hi Dean,

Thank you for your interest in Hivemall. Twitter's paper actually influenced me in developing Hivemall, and I initially implemented such functionality as Pig UDFs.
Though my Pig ML library is not released, you can find a similar attempt for Pig at https://github.com/y-tag/java-pig-MyUDFs

Thanks,
Makoto

2013/10/3 Dean Wampler deanwamp...@gmail.com:

This is great news! I know that Twitter has done something similar with UDFs for Pig, as described in this paper:
http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf

I'm glad to see the same thing start with Hive.

Dean

On Wed, Oct 2, 2013 at 10:21 AM, Makoto YUI yuin...@gmail.com wrote:

Hello all,

My employer, AIST, has given the thumbs up to open-source our machine learning library, named Hivemall. Hivemall is a scalable machine learning library running on Hive/Hadoop, licensed under the LGPL 2.1.
https://github.com/myui/hivemall

Hivemall provides machine learning functionality, as well as feature engineering functions, through Hive UDFs/UDAFs/UDTFs. It is designed to scale to the number of training instances as well as the number of training features.

Hivemall is very easy to use, as every machine learning step is done within HiveQL.

-- Installation is just as follows:
add jar /tmp/hivemall.jar;
source /tmp/define-all.hive;

-- Logistic regression is performed by a query:
SELECT
  feature,
  avg(weight) as weight
FROM (
  SELECT logress(features, label) as (feature, weight)
  FROM training_features
) t
GROUP BY feature;

You can find detailed examples on our wiki pages.
https://github.com/myui/hivemall/wiki/_pages

Though we consider Hivemall to be much easier to use and more scalable than Mahout for classification/regression tasks, please check it for yourself. If you have a Hive environment, you can evaluate Hivemall within 5 minutes or so.

Hope you enjoy the
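For readers unfamiliar with the online learners mentioned at the top of this thread: CW/AROW-style algorithms keep a per-weight confidence (a variance) alongside each weight mean, and update low-confidence weights more aggressively than a plain SGD learner would. Below is a minimal, illustrative Python sketch of a diagonal-covariance AROW update. This is not Hivemall's implementation; the class name, the r parameter, and the dense-list feature format are assumptions made for the demo.

```python
# Illustrative diagonal AROW binary classifier (NOT Hivemall's code).
# Labels are +1/-1; features are dense lists of floats.

class ArowClassifier:
    def __init__(self, n_features, r=0.1):
        self.mu = [0.0] * n_features     # weight means
        self.sigma = [1.0] * n_features  # per-weight variances (diagonal covariance)
        self.r = r                       # regularization parameter

    def predict(self, x):
        return sum(w * xi for w, xi in zip(self.mu, x))

    def update(self, x, y):
        margin = y * self.predict(x)
        if margin >= 1.0:                # no hinge loss -> no update
            return
        # confidence of this example under the current covariance
        v = sum(s * xi * xi for s, xi in zip(self.sigma, x))
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - margin) * beta
        for i, xi in enumerate(x):
            self.mu[i] += alpha * y * self.sigma[i] * xi
            self.sigma[i] -= beta * self.sigma[i] * self.sigma[i] * xi * xi

# Tiny usage example: learn the sign of the first feature.
clf = ArowClassifier(n_features=2)
for _ in range(10):
    clf.update([1.0, 0.0], +1)
    clf.update([-1.0, 0.0], -1)
print(clf.predict([1.0, 0.0]) > 0)  # prints True
```

Hivemall exposes these classifiers as Hive functions, so a user never writes this loop; the sketch is only to convey why the update differs from plain SGD (the variance term shrinks for frequently seen features, making their weights harder to move).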
NullPointerException on Sample Tables / CDH 4.4
Hey everyone,

I was supplied with a decent ten-node CDH 4.4 cluster, only 7 days old, on which someone had tried some HBase stuff. I wanted to apply my workflows (consisting of HiveQL scripts and Oozie workflows), which work on another cluster, to that cluster, but unfortunately I ran into the following issue: when running any SELECT statement against any table (sample tables or tables belonging to my workflow), I get the following error stack (from Beeswax and from the Hive CLI):

java.io.IOException: org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.sortLocatedBlocks(DatanodeManager.java:334)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1245)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:413)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:172)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44938)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745)

I already tried setting up a new (blank) table and inserting some dummy data, but the error stays the same. Some connection between HDFS and Hive regarding block information seems to be broken.

Any idea how to fix this, where the relevant configuration is, or what other things we should check?

Cheers
Fabian
Re: NullPointerException on Sample Tables / CDH 4.4
Somehow, after three days of searching: I just deployed the client configuration for all Hive roles again, and the error seems to be gone. Let's see what the future brings.

Cheers
Re: [ANN] Hivemall: Hive scalable machine learning library
Just tried this for some hot trends in forum management. It was pretty impressive. I will try it more deeply and, if possible, integrate it into my product.

Thanks for the awesome work.

Nitin
Re: NPE org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable
Development environment: Hive 0.11, Hadoop 1.0.3

2013/10/11 xinyan Yang moon.yan...@gmail.com:

Hi, when I run this SQL it fails. Can anyone give me advice?

select e.udid as udid, e.app_id as app_id
from acorn_3g.ClientChannelDefine cc
join (
  select udid, app_id, from_id
  from (
    select u.device_id as udid, u.app_id as app_id, g.device_id as 3gdid, u.from_id as from_id
    from acorn_3g.user_device_info u
    left outer join (select device_id from acorn_3g.3g_device_id where log_date'2013-09-15') g
      on u.device_id = g.device_id
    where u.log_date='2013-09-15' and u.from_id0 and u.type=1
  ) f1
  where 3gdid is null
) e on (e.from_id = cc.from_id)

Error info:

Task with the most failures(4):
- Task ID: task_201305281414_236693_m_01
  URL: http://YZSJHL18-22.opi.com:50030/taskdetails.jsp?jobid=job_201305281414_236693&tipid=task_201305281414_236693_m_01
- Diagnostic Messages for this Task:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:162)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1377)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1381)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:611)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
    ... 8 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:186)
    ... 14 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 343 Reduce: 2 Cumulative CPU: 3478.61 sec HDFS Read: 1862106687 HDFS Write: 3838425 SUCCESS
Job 1: Map: 2 HDFS Read: 0 HDFS Write: 0 FAIL
Re: NPE org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable
Hello Xinyan,

Can you attach the query plan (the output of EXPLAIN)? I think a bad plan caused the error. Also, can you try Hive trunk? This looks like a bug that was fixed after the 0.11 release.

Thanks,

Yin
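The failing frame is MapJoinOperator.loadHashTable, so while gathering the EXPLAIN output, one hedged workaround to try (assuming the NPE is triggered by Hive automatically converting the join into a map join; I believe 0.11 turned `hive.auto.convert.join` on by default) is to disable that conversion and rerun:

```sql
-- Hedged workaround while debugging, not a fix: fall back to a reduce-side
-- (common) join so MapJoinOperator's hash-table load is never reached.
set hive.auto.convert.join=false;

-- Then capture the plan that was requested above:
EXPLAIN
select e.udid as udid, e.app_id as app_id
from acorn_3g.ClientChannelDefine cc
join ( /* subquery as in the original statement */ ) e
  on (e.from_id = cc.from_id);
```

If the query succeeds with the conversion disabled, that narrows the bug to the map-join path rather than the query itself.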
hive partition pruning on joining on partition column
I have a requirement I am trying to support in Hive, and I am not sure if it is doable. I have Hadoop 1.1.1 with Hive 0.9.0 (using Derby as the metastore).

I partition my data by a dt column, so my table 'foo' has partitions like 'dt=2013-07-01' through 'dt=2013-07-30'. Now the user wants to query the data for Saturdays only. To make this flexible, instead of asking the end user to find out which dates in that month are Saturdays, I added a lookup table (call it 'bar') in Hive with the following columns: year, month, day, dt_format, week_of_day.

So I want to see if I can join foo and bar and still get partition pruning:

select *
from foo
join bar on (bar.year=2013 and bar.month=7 and bar.day_of_week=6 and bar.dt_format = foo.dt)

I tried several ways, like switching the table order, joining with a subquery, etc., but none of them makes partition pruning work on table foo in this case. Is this really achievable in Hive?

Thanks
Yong
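One workaround when the optimizer cannot prune through a join like this is to compute the qualifying dates outside Hive and inline them as partition literals, which the pruner handles reliably. A minimal Python sketch (the table name foo and the dt format are taken from the example above; the script itself is hypothetical):

```python
from datetime import date, timedelta

def saturdays(year, month):
    """All Saturdays of the given month, formatted like the dt partition values."""
    d = date(year, month, 1)
    # advance to the first Saturday (Monday=0 ... Saturday=5)
    d += timedelta(days=(5 - d.weekday()) % 7)
    days = []
    while d.month == month:
        days.append(d.isoformat())
        d += timedelta(days=7)
    return days

dts = saturdays(2013, 7)
print(dts)  # ['2013-07-06', '2013-07-13', '2013-07-20', '2013-07-27']
print("select * from foo where dt in (%s)"
      % ", ".join("'%s'" % dt for dt in dts))
```

Generating the query this way keeps the user-facing flexibility while guaranteeing that only the four Saturday partitions of July 2013 are scanned.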
Re: hive partition pruning on joining on partition column
The easiest way to do this is to create a table where each date maps to week of month, week of year, day of week, and day of month, and then join on just the date, putting the conditions in the WHERE clause. In my understanding it is easy to manipulate the date column, and you can join just on date and get results based on the WHERE conditions.

PS: this is what we currently do, where we have to run continuous rollup analytics for year-to-date or parameter-to-date calculations.

Wait for others to give you better solutions.

Nitin Pawar