[ 
https://issues.apache.org/jira/browse/KYLIN-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15396108#comment-15396108
 ] 

Richard Calaba commented on KYLIN-1834:
---------------------------------------

Hmmm, strange - let me try to reproduce and provide the exact cube metadata so 
you can import it and look at it.

Did you try with Kylin 1.5.2.1, or did you use the latest Kylin sources from git?

> java.lang.IllegalArgumentException: Value not exists! - in Step 4 - Build 
> Dimension Dictionary
> ----------------------------------------------------------------------------------------------
>
>                 Key: KYLIN-1834
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1834
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v1.5.2, v1.5.2.1
>            Reporter: Richard Calaba
>            Priority: Blocker
>         Attachments: job_2016_06_28_09_59_12-value-not-found.zip
>
>
> Getting exception in Step 4 - Build Dimension Dictionary:
> java.lang.IllegalArgumentException: Value not exists!
>       at org.apache.kylin.dimension.Dictionary.getIdFromValueBytes(Dictionary.java:160)
>       at org.apache.kylin.dict.TrieDictionary.getIdFromValueImpl(TrieDictionary.java:158)
>       at org.apache.kylin.dimension.Dictionary.getIdFromValue(Dictionary.java:96)
>       at org.apache.kylin.dimension.Dictionary.getIdFromValue(Dictionary.java:76)
>       at org.apache.kylin.dict.lookup.SnapshotTable.takeSnapshot(SnapshotTable.java:96)
>       at org.apache.kylin.dict.lookup.SnapshotManager.buildSnapshot(SnapshotManager.java:106)
>       at org.apache.kylin.cube.CubeManager.buildSnapshotTable(CubeManager.java:215)
>       at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:59)
>       at org.apache.kylin.cube.cli.DictionaryGeneratorCLI.processSegment(DictionaryGeneratorCLI.java:42)
>       at org.apache.kylin.engine.mr.steps.CreateDictionaryJob.run(CreateDictionaryJob.java:56)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>       at org.apache.kylin.engine.mr.common.HadoopShellExecutable.doWork(HadoopShellExecutable.java:60)
>       at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
>       at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:50)
>       at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:114)
>       at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:124)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
> result code:2
> The code which generates the exception is:
> org.apache.kylin.dimension.Dictionary.java:
>     /**
>      * A lower level API, return ID integer from raw value bytes. In case of not found
>      * <p>
>      * - if roundingFlag=0, throw IllegalArgumentException; <br>
>      * - if roundingFlag<0, the closest smaller ID integer if exist; <br>
>      * - if roundingFlag>0, the closest bigger ID integer if exist. <br>
>      * <p>
>      * Bypassing the cache layer, this could be significantly slower than getIdFromValue(T value).
>      *
>      * @throws IllegalArgumentException
>      *             if value is not found in dictionary and rounding is off;
>      *             or if rounding cannot find a smaller or bigger ID
>      */
>     final public int getIdFromValueBytes(byte[] value, int offset, int len, int roundingFlag) throws IllegalArgumentException {
>         if (isNullByteForm(value, offset, len))
>             return nullId();
>         else {
>             int id = getIdFromValueBytesImpl(value, offset, len, roundingFlag);
>             if (id < 0)
>                 throw new IllegalArgumentException("Value not exists!");
>             return id;
>         }
>     }
> ==========================================================
> The Cube is big: the fact table has 110 million rows, and the largest 
> dimension (customer) has 10 million rows. I have increased the JVM -Xmx to 
> 16 GB and set kylin.table.snapshot.max_mb=2048 in kylin.properties to make 
> sure the Cube build doesn't fail (previously we were getting an exception 
> complaining about the 300 MB limit for the Dimension dictionary size; 
> approx. 700 MB was required).
> ==========================================================
> Before that we were getting an exception complaining about a Dictionary 
> encoding problem - "Too high cardinality is not suitable for dictionary -- 
> cardinality: 10873977" - which we resolved by changing the affected 
> dimension/row key Encoding from "dict" to "int; length=8" in the Advanced 
> Settings of the Cube.
> ==========================================================
> We have 2 high-cardinality fields (one from the fact table and one from the 
> big dimension (customer, see above)) that we need to use in count_distinct 
> measures for our calculations. I wonder if this "Value not exists!" 
> exception is somehow related? One of those count_distinct measures is 
> defined with return type "bitmap" (exact precision, only for Int columns) 
> and the other with return type "hllc16" (error rate <= 1.22 %).
> ==========================================================
> I am looking for any clues to debug the cause of this error and a way to 
> circumvent it.
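
For reference while reproducing this, the rounding contract described in the quoted Javadoc can be illustrated with a small sketch. It is not Kylin code: RoundingLookupSketch, its sorted-array storage, and getId are hypothetical stand-ins for the trie-based dictionary. They only show that an exact lookup (roundingFlag=0) of a value missing from the dictionary ends in the same "Value not exists!" exception, while a non-zero flag falls back to a neighbouring ID when one exists.

```java
import java.util.Arrays;

// Hypothetical toy dictionary: IDs are positions in a sorted array.
// This is NOT Kylin's TrieDictionary, only an illustration of the rounding contract.
class RoundingLookupSketch {
    private final String[] sortedValues;

    RoundingLookupSketch(String... values) {
        sortedValues = values.clone();
        Arrays.sort(sortedValues);
    }

    // Returns the ID of value, or a neighbouring ID depending on roundingFlag.
    int getId(String value, int roundingFlag) {
        int pos = Arrays.binarySearch(sortedValues, value);
        if (pos >= 0)
            return pos; // exact match
        int insertionPoint = -pos - 1; // index of the first element greater than value
        int id;
        if (roundingFlag < 0)
            id = insertionPoint - 1; // closest smaller ID; -1 when there is none
        else if (roundingFlag > 0)
            id = insertionPoint < sortedValues.length ? insertionPoint : -1; // closest bigger ID
        else
            id = -1; // rounding is off
        if (id < 0)
            throw new IllegalArgumentException("Value not exists!");
        return id;
    }

    public static void main(String[] args) {
        RoundingLookupSketch dict = new RoundingLookupSketch("apple", "cherry", "mango");
        System.out.println(dict.getId("cherry", 0));  // present: exact ID
        System.out.println(dict.getId("banana", -1)); // missing: rounds down
        System.out.println(dict.getId("banana", 1));  // missing: rounds up
        try {
            dict.getId("banana", 0);                  // missing, rounding off
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the snapshot step does an exact lookup, as the one-argument getIdFromValue frames in the trace suggest, any lookup-table value that is absent from the dictionary would fail this way.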



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
