Re: [DISCUSSION] Support only spark 2 in carbon 1.3.0

2017-10-09 Thread manish gupta
+1

Regards
Manish Gupta

On Mon, 9 Oct 2017 at 5:11 PM, Suprith T Jain  wrote:

> +1
>
> On 09-Oct-2017 7:27 AM, "Lu Cao"  wrote:
>
> > Hi community,
> > Currently we have three Spark-related modules in CarbonData (Spark 1.5,
> > 1.6, and 2.1); the project has become more and more difficult to maintain
> > and has a lot of redundant code.
> > I propose to stop supporting Spark 1.5 & 1.6 and focus on Spark 2.1 (2.2).
> > That will keep the project clean and simple to maintain.
> > Maybe we can provide some key patches for the old versions, but new
> > features could support Spark 2 only.
> > Any ideas?
> >
> >
> > Thanks & Regards,
> > Lionel Cao
> >
>


Re: DataMap Interface requires `IndexColumns` as Input

2017-10-09 Thread Ravindra Pesala
Hi,

The indexed columns on which a datamap is created are available in
DataMapFactory; you can check the getMeta method. By using the filter
expression tree during pruning, we can get the filter columns and prune the
related datamap.
Please don't refer to PR 1399 yet, as it is still incomplete and many
things will change in it.
We are again updating the DataMap interfaces to support FG for storing and
retrieving rowid tuples from the datamap. We will soon add a proper example
for the same.
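
The idea of walking the filter expression tree to collect the filter columns
and then picking the datamaps that index those columns can be sketched
roughly as below. This is only an illustration under stated assumptions: the
types Expr, ColumnPredicate, BinaryLogical and DataMapInfo, and the methods
filterColumns and relevantDataMaps, are hypothetical names invented for this
sketch. They are not the CarbonData DataMap interfaces, which are still
changing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DataMapPruningSketch {

    /** Hypothetical filter expression node; not a CarbonData class. */
    interface Expr {
        void collectColumns(Set<String> out);
    }

    /** Leaf node: a predicate on a single column, e.g. city = 'x'. */
    static final class ColumnPredicate implements Expr {
        final String column;
        ColumnPredicate(String column) { this.column = column; }
        public void collectColumns(Set<String> out) { out.add(column); }
    }

    /** Inner node: AND / OR over two child expressions. */
    static final class BinaryLogical implements Expr {
        final Expr left;
        final Expr right;
        BinaryLogical(Expr left, Expr right) { this.left = left; this.right = right; }
        public void collectColumns(Set<String> out) {
            left.collectColumns(out);
            right.collectColumns(out);
        }
    }

    /** Hypothetical descriptor: a datamap name plus the columns it indexes. */
    static final class DataMapInfo {
        final String name;
        final Set<String> indexedColumns;
        DataMapInfo(String name, String... columns) {
            this.name = name;
            this.indexedColumns = new LinkedHashSet<>(Arrays.asList(columns));
        }
    }

    /** Walks the expression tree and returns every column used in the filter. */
    static Set<String> filterColumns(Expr filter) {
        Set<String> columns = new LinkedHashSet<>();
        filter.collectColumns(columns);
        return columns;
    }

    /** Keeps only the datamaps that index at least one filter column. */
    static List<String> relevantDataMaps(Expr filter, List<DataMapInfo> dataMaps) {
        Set<String> columns = filterColumns(filter);
        List<String> result = new ArrayList<>();
        for (DataMapInfo dm : dataMaps) {
            if (!Collections.disjoint(dm.indexedColumns, columns)) {
                result.add(dm.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Filter: city = ... AND age > ...
        Expr filter = new BinaryLogical(new ColumnPredicate("city"),
                                        new ColumnPredicate("age"));
        List<DataMapInfo> dataMaps = Arrays.asList(
            new DataMapInfo("dm_city", "city"),
            new DataMapInfo("dm_salary", "salary"));
        System.out.println(relevantDataMaps(filter, dataMaps)); // prints [dm_city]
    }
}
```

In this sketch the index columns live with each datamap descriptor, so the
pruning step needs only the filter expression as input.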

Regards,
Ravindra.

On 9 October 2017 at 23:22, Dong Xie  wrote:

> Hi,
>
> The DataMap API currently misses an important input parameter,
> `IndexColumns`. It is common to implement only one type of DataMap and
> apply it to different data and different column sets. In PR 1399, no index
> columns are specified. I think it would be nice to include them in the
> API.
>
> Thanks,
> Dong




-- 
Thanks & Regards,
Ravi


DataMap Interface requires `IndexColumns` as Input

2017-10-09 Thread Dong Xie
Hi,

The DataMap API currently misses an important input parameter,
`IndexColumns`. It is common to implement only one type of DataMap and apply
it to different data and different column sets. In PR 1399, no index columns
are specified. I think it would be nice to include them in the API.

Thanks,
Dong

Re: [DISCUSSION] Support only spark 2 in carbon 1.3.0

2017-10-09 Thread Suprith T Jain
+1

On 09-Oct-2017 7:27 AM, "Lu Cao"  wrote:

> Hi community,
> Currently we have three Spark-related modules in CarbonData (Spark 1.5,
> 1.6, and 2.1); the project has become more and more difficult to maintain
> and has a lot of redundant code.
> I propose to stop supporting Spark 1.5 & 1.6 and focus on Spark 2.1 (2.2).
> That will keep the project clean and simple to maintain.
> Maybe we can provide some key patches for the old versions, but new
> features could support Spark 2 only.
> Any ideas?
>
>
> Thanks & Regards,
> Lionel Cao
>


Re: Reply: [DISCUSSION] Support only spark 2 in carbon 1.3.0

2017-10-09 Thread Sandeep Purohit
+1

On Mon, Oct 9, 2017 at 1:35 PM, Kunal Kapoor wrote:

> +1
>
> On 09-Oct-2017 9:32 AM, "岑玉海"  wrote:
>
> > +1
> >
> >
> > Best regards!
> > Yuhai Cen
> >
> >
> > On 9 October 2017 at 09:56, Lu Cao wrote:
> > Hi community,
> > Currently we have three Spark-related modules in CarbonData (Spark 1.5,
> > 1.6, and 2.1); the project has become more and more difficult to maintain
> > and has a lot of redundant code.
> > I propose to stop supporting Spark 1.5 & 1.6 and focus on Spark 2.1 (2.2).
> > That will keep the project clean and simple to maintain.
> > Maybe we can provide some key patches for the old versions, but new
> > features could support Spark 2 only.
> > Any ideas?
> >
> >
> > Thanks & Regards,
> > Lionel Cao
> >
>


Re: Reply: [DISCUSSION] Support only spark 2 in carbon 1.3.0

2017-10-09 Thread Kunal Kapoor
+1

On 09-Oct-2017 9:32 AM, "岑玉海"  wrote:

> +1
>
>
> Best regards!
> Yuhai Cen
>
>
> On 9 October 2017 at 09:56, Lu Cao wrote:
> Hi community,
> Currently we have three Spark-related modules in CarbonData (Spark 1.5,
> 1.6, and 2.1); the project has become more and more difficult to maintain
> and has a lot of redundant code.
> I propose to stop supporting Spark 1.5 & 1.6 and focus on Spark 2.1 (2.2).
> That will keep the project clean and simple to maintain.
> Maybe we can provide some key patches for the old versions, but new
> features could support Spark 2 only.
> Any ideas?
>
>
> Thanks & Regards,
> Lionel Cao
>


Re: Cause of Compaction?

2017-10-09 Thread Rahul Kumar
Hi Sunerhan,

Can you give some more information, such as:

1. Your table schema.
2. Your update query.
3. Some more logs.

Thanks and Regards,
Rahul Kumar



On Mon, Oct 9, 2017 at 3:35 PM, sunerhan1...@sina.com  wrote:

> hello,
>
> My application has been running for a long time, constantly updating and
> inserting into a table.
>
> I got a strange exception like the following:
>
> ERROR command.ProjectForUpdateCommand$: main Update operation passed.
> Exception in Horizontal Compaction. Please check
> logs.org.apache.spark.sql.execution.command.HorizontalCompactionException:
> Horizontal Update Compaction Failed for [e_carbon.prod_inst_his_c].
> Compaction failed. Please check logs for more info. Exception in compaction
> java.lang.Exception: Compaction Failure in Merger Rdd.
>
> Can anyone explain what may cause this exception?
>
>
>
> sunerhan1...@sina.com
>


Cause of Compaction?

2017-10-09 Thread sunerhan1...@sina.com
hello,

My application has been running for a long time, constantly updating and
inserting into a table.

I got a strange exception like the following:

ERROR command.ProjectForUpdateCommand$: main Update operation passed.
Exception in Horizontal Compaction. Please check
logs.org.apache.spark.sql.execution.command.HorizontalCompactionException:
Horizontal Update Compaction Failed for [e_carbon.prod_inst_his_c].
Compaction failed. Please check logs for more info. Exception in compaction
java.lang.Exception: Compaction Failure in Merger Rdd.

Can anyone explain what may cause this exception?



sunerhan1...@sina.com


Does index be used when doing "join" operation between a big table and a small table?

2017-10-09 Thread sunerhan1...@sina.com
hello,

I have 2 tables that need to be joined on their primary keys; the primary
keys of both tables are of type "String".

There are 200 million rows in the big table and only 20 thousand rows in the
small table.

This join operation is quite slow.
I want to know whether an index is used when doing a "join" operation between
a big table and a small table.

And how can I confirm whether an index is used?





sunerhan1...@sina.com