[ https://issues.apache.org/jira/browse/PHOENIX-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420001#comment-15420001 ]

Samarth Jain commented on PHOENIX-2995:
---------------------------------------

[~tdsilva] - In CreateTableCompiler, is there a reason why you are incrementing the timestamp by 1?

{code}
+                                public void addTable(PTable table, long resolvedTime) throws SQLException {
+                                    connection.addTable(table, resolvedTime);
                                 }
                             },
-                            connection, tableRef.getTimeStamp());
+                            connection, tableRef.getTimeStamp()+1);
{code}


In TableInfo, is it possible for any of the member variables to be null?

{code}
+    private static class TableInfo {
+        
+        private boolean isDataTable;
+        private PName hTableName;
+        private TableRef origTableRef;

{code}

If not, I would mark them as non-null and make the variables final too. You can then enforce non-nullability with a Preconditions.checkNotNull() check in the constructor. This would also let you get rid of the null checks in the equals() and hashCode() methods of the class.
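
Something along these lines, perhaps (just a sketch, using Guava's Preconditions and assuming the fields are only ever set in the constructor):

{code}
private static class TableInfo {

    private final boolean isDataTable;
    private final PName hTableName;
    private final TableRef origTableRef;

    public TableInfo(boolean isDataTable, PName hTableName, TableRef origTableRef) {
        this.isDataTable = isDataTable;
        // fail fast on construction so equals()/hashCode() can assume non-null fields
        this.hTableName = Preconditions.checkNotNull(hTableName);
        this.origTableRef = Preconditions.checkNotNull(origTableRef);
    }
}
{code}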

It is kinda hard to follow what has changed in MutationState since a lot of the diff is just indentation changes. Would you mind attaching a patch without the whitespace diff ({{git diff -w}})?

In PhoenixConnection, are these changes really needed? 

{code}
-                         ! Objects.equal(tenantId, table.getTenantId())) );
+                         (table.getTenantId()!=null && ! Objects.equal(tenantId, table.getTenantId()))));

-                         ! Objects.equal(tenantId, function.getTenantId()));
+                        (function.getTenantId()!=null && ! Objects.equal(tenantId, function.getTenantId())));
{code}
Doesn't Objects.equal() do the same check?
{code}
public static boolean equal(@Nullable Object a, @Nullable Object b) {
    return a == b || (a != null && a.equals(b));
}
{code}
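
For reference, the null cases it already covers:

{code}
// null handling of Guava's Objects.equal:
Objects.equal(null, null); // true  (a == b)
Objects.equal("a", null);  // false (a != null, but !a.equals(b))
Objects.equal(null, "a");  // false (a != b and a == null)
{code}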

Is it possible to combine the prune and the clone into one step? We already go over all the tables when cloning the metadata, so we might as well prune them at the same time.

{code}
+        this.metaData = metaData.clone();
+        this.metaData.pruneTables(pruner);
+        this.metaData.pruneFunctions(pruner);
{code}
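
Something like the following, maybe (a sketch only: pruneAndClone is a hypothetical method on PMetaDataImpl, and I'm assuming the backing table map, a constructor that accepts it, and that Pruner.prune() returns true for entries to drop):

{code}
public PMetaData pruneAndClone(Pruner pruner) {
    // build the clone's map in a single pass, copying only the surviving tables
    Map<PTableKey, PTable> kept = Maps.newHashMapWithExpectedSize(tables.size());
    for (PTable table : tables.values()) {
        if (!pruner.prune(table)) {
            kept.put(table.getKey(), table);
        }
    }
    return new PMetaDataImpl(kept);
}
{code}

The caller would then collapse to a single line: {{this.metaData = metaData.pruneAndClone(pruner);}} (and similarly for the functions).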

Looks like nothing changed in FormatToBytesWritableMapper.java. Please revert 
it.

As far as the synchronization is concerned, I think it should now be at the PMetaDataImpl level. Earlier, because we cloned on every mutation of PMetaData, we guarded the change of the reference with the lock. That lock also guarded against the query services being closed (which sets the metadata to null). Without such a lock in place, we often ran into NPEs when creating a new Phoenix connection or doing any other operation with the metadata instance.

One easy way to do this would be to make all the public methods of PMetaDataImpl synchronized. We can discuss it further on Monday.
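
As a rough illustration of the shape (not the actual PMetaDataImpl signatures):

{code}
// Minimal sketch: every public method that touches the shared state takes the
// instance lock, so a reader can never observe the metadata mid-mutation or
// after it has been nulled out by a concurrent close().
class MetaDataHolderSketch {
    private final java.util.Map<String, Object> tables = new java.util.HashMap<>();

    public synchronized void addTable(String name, Object table) {
        tables.put(name, table);
    }

    public synchronized Object getTable(String name) {
        return tables.get(name);
    }

    public synchronized void removeTable(String name) {
        tables.remove(name);
    }
}
{code}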


> Write performance severely degrades with large number of views 
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-2995
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2995
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Mujtaba Chohan
>            Assignee: Thomas D'Silva
>              Labels: Argus
>             Fix For: 4.8.1
>
>         Attachments: PHOENIX-2995-v2.patch, PHOENIX-2995-v3.patch, 
> PHOENIX-2995.patch, create_view_and_upsert.png, image.png, image2.png, 
> image3.png, upsert_rate.png
>
>
> Write performance for each 1K batch degrades significantly when *10K* views 
> are written to at random with the default 
> {{phoenix.client.maxMetaDataCacheSize}}. With all views created, the upsert 
> rate stays around 25 seconds per 1K batch, i.e. ~2K rows/min. 
> When {{phoenix.client.maxMetaDataCacheSize}} is increased to 100MB+, views do 
> not need to be re-resolved and the upsert rate returns to the normal ~60K 
> rows/min.
> With *100K* views and {{phoenix.client.maxMetaDataCacheSize}} set to 1GB, I 
> wasn't able to create all 100K views, as the upsert time for each 1K batch 
> keeps steadily increasing. 
> The following graph shows the 1K batch upsert rate over time as the number of 
> views varies. Rows are upserted to random views; {{CREATE VIEW IF NOT EXISTS ... 
> APPEND_ONLY_SCHEMA = true, UPDATE_CACHE_FREQUENCY=900000}} is executed before 
> each upsert statement.
> !upsert_rate.png!
> The base table is also created with {{APPEND_ONLY_SCHEMA = true, 
> UPDATE_CACHE_FREQUENCY = 900000, AUTO_PARTITION_SEQ}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
