We are writing ORC files in our application for Hive to consume.
Given enough time, we have noticed that writing causes an NPE when
working with a string column's stats. Not sure what's causing it on
our side yet, since replaying the same data works fine; it seems more
like this just happens over t
Hi Gurus,
I am facing a ParseException with the following code in my Hive UDF. Only the
evaluate method is shown.
private final SimpleDateFormat sdf = new SimpleDateFormat("dd-MM-",
Locale.US);
public Object evaluate(DeferredObject[] arguments) throws HiveException {
String result = "0";
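One common cause of a ParseException in code like this is an input string that
does not match the formatter's pattern. A minimal sketch follows, assuming a
hypothetical pattern "dd-MM-yyyy" (the pattern in the message above is cut off)
and a hypothetical helper parseOrNull that catches the exception and returns
null instead of failing the query. It is an illustration only, not the
poster's actual UDF.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class DateParseSketch {
    // Hypothetical pattern: the one in the original message is truncated.
    public static final String PATTERN = "dd-MM-yyyy";

    // Returns the parsed date, or null when the input is null or does not
    // match PATTERN. A new formatter is created per call because
    // SimpleDateFormat is not thread-safe.
    public static Date parseOrNull(String text) {
        if (text == null) {
            return null;
        }
        SimpleDateFormat sdf = new SimpleDateFormat(PATTERN, Locale.US);
        sdf.setLenient(false); // reject out-of-range values instead of rolling them over
        try {
            return sdf.parse(text.trim());
        } catch (ParseException e) {
            return null; // input did not match the pattern
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("01-09-2015") != null);
        System.out.println(parseOrNull("2015/09/01") == null);
    }
}
```

Inside evaluate, a null result could be mapped to whatever default the UDF
should return for malformed input.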
+ dev mail list
The original correlation optimization may have been designed for the MR engine, but
a similar optimization could be applied for Tez too. Is there an existing
JIRA to track that?
On Tue, Sep 1, 2015 at 1:58 PM, Jeff Zhang wrote:
> Hi Pengcheng,
>
> Is there reason why the correlation o
I see from the docs that QueryPlan can be serialized to string using the
toThriftJSONString() function.
How do I deserialize it? Any pointers would be helpful.
Thanks,
Raajay
On Tue, Sep 1, 2015 at 11:26 AM, Raajay wrote:
> Hi Canan,
>
> The changes that I am primarily interested in are:
>
> a. A
It seems Hive 1.2 fixed this issue, but I am not sure which JIRA is related.
Is it possible to backport this fix to Hive 0.13?
On Tue, Sep 1, 2015 at 5:35 PM, Jim Green wrote:
> Hi Team,
>
> Below is a minimal reproduction of wrong results in Hive 0.13:
>
> *1. Create 4 tables*
> CREATE EXTERNA
Hi Team,
Below is a minimal reproduction of wrong results in Hive 0.13:
*1. Create 4 tables*
CREATE EXTERNAL TABLE testjoin1( joincol string );
CREATE EXTERNAL TABLE testjoin2(
anothercol string ,
joincol string);
CREATE EXTERNAL TABLE testjoin3( anothercol string);
CREATE EXTERNAL TABLE t
Hi Group,
The default value of hive.exec.max.created.files is 100,000.
I have a huge table without partitions.
I want to remove duplicates by using "group by" and insert the result into a new
partitioned table using dynamic partitioning.
Exception:
[Fatal Error] total number of created files now is 101859, which e
Hi Canan,
The changes that I am primarily interested in are:
a. Altering the parallelism of the DAG
b. Changing task location hints, etc.
In general, I want to make these alterations and run the DAGs on Tez,
without having to go through the Hive pipeline.
Raajay
On Mon, Aug 31, 2015 at 11:42 PM, ca
Hi All,
I am trying to configure Hive 1.2.1 to support concurrency. In
order to do so, ZooKeeper has to be running and a remote metastore has to be
used. Is there a good doc that I can follow to configure both Hive and
ZooKeeper to support concurrency? Thank you very much.