Thank you very much!
I have divided the 2 billion rows into 4 pieces and loaded them into the table.
The three parameters carbon.graph.rowset.size, carbon.sort.size,
and carbon.number.of.cores.while.loading may also have an effect.
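For reference, a minimal carbon.properties sketch with these three parameters is below; the values are illustrative assumptions for tuning, not the ones used in this test:

# carbon.properties -- data-loading tuning (example values only)
# rowset size passed between steps of the data-load graph
carbon.graph.rowset.size=100000
# number of records sorted per batch during the load
carbon.sort.size=500000
# CPU cores used per node while loading
carbon.number.of.cores.while.loading=6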
Best regards!
At 2017-03-27 13:53:58, "Liang Chen" wrote:
Hi
Please enable the vector reader; it might help the limit queries.
import org.apache.carbondata.core.util.CarbonProperties
import org.apache.carbondata.core.constants.CarbonCommonConstants
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.ENABLE_VECTOR_READER, "true")
Regards
Liang
a wrote:
Hi
1. Use your current test environment (CarbonData 1.0 + Spark 1.6): please
divide the 2 billion rows into 4 pieces (each 0.5 billion) and load the data
again (see the sketch below).
2. For CarbonData 1.0 + Spark 1.6 with Kettle for loading data, please
configure the below 3 parameters in carbon.properties (note: please copy the
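As a sketch of step 1, loading the 4 pieces could look like the statements below; the HDFS paths and the DELIMITER option are hypothetical placeholders, not taken from this thread:

LOAD DATA INPATH 'hdfs://namenode/data/part1.csv' INTO TABLE carbon_table OPTIONS('DELIMITER'=',');
LOAD DATA INPATH 'hdfs://namenode/data/part2.csv' INTO TABLE carbon_table OPTIONS('DELIMITER'=',');
-- repeat for part3.csv and part4.csv, so each load covers roughly 0.5 billion rows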
TEST SQL:
High-cardinality random query:
select * from carbon_table where dt='2017-01-01' and user_id='' limit 100;
High-cardinality random query with LIKE:
select * from carbon_table where dt='2017-01-01' and fo like '%%' limit 100;
Low-cardinality random query:
select * from carbon_table where dt='2017-01-01' and plat='android' and
tv='8400' limit 100;