SonixLegend created PHOENIX-2804:
------------------------------------

Summary: Support partition parameter or repartition function for Spark plugin
Key: PHOENIX-2804
URL: https://issues.apache.org/jira/browse/PHOENIX-2804
Project: Phoenix
Issue Type: Improvement
Affects Versions: 4.7.0
Reporter: SonixLegend
Fix For: 4.8.0

When I load a large amount of data from Phoenix into Spark DataFrames through the Phoenix Spark plugin, with the DataFrames' storage level set to disk only, a join between the DataFrames fails with "java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE". The cause is that Spark reads more than 2 GB for a single partition. Could you add a partition parameter to the plugin, or have it override the repartition function, so that the number of partitions can be controlled when loading data? Thanks a lot.
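
To illustrate the arithmetic behind the request, here is a minimal sketch of how a partition count could be chosen so that no partition approaches the 2 GB limit. The helper name `min_partitions`, the 128 MiB default target, and the idea of estimating the table size up front are assumptions made for illustration; none of this is part of the plugin.

```python
# Largest size reported by the exception in this issue: Integer.MAX_VALUE bytes.
INT_MAX_BYTES = 2**31 - 1


def min_partitions(total_bytes, target_bytes=128 * 1024 * 1024):
    """Return how many partitions keep the average partition <= target_bytes.

    Hypothetical helper. target_bytes defaults to an assumed conservative
    128 MiB, far below the 2 GB limit, to leave headroom for skewed data.
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if not 0 < target_bytes <= INT_MAX_BYTES:
        raise ValueError("target_bytes must be in (0, Integer.MAX_VALUE]")
    # Integer ceiling division; always return at least one partition.
    return max(1, (total_bytes + target_bytes - 1) // target_bytes)
```

Until the plugin accepts such a parameter, the result could be passed to `DataFrame.repartition` right after loading and before persisting or joining. Note that this only bounds the average partition size; heavily skewed data may still produce an oversized partition.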