-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/73010/#review222185
-----------------------------------------------------------


repository/src/main/java/org/apache/atlas/repository/patches/AtlasPatchManager.java
Lines 58 (patched)
<https://reviews.apache.org/r/73010/#comment311245>

    Should ReIndexPatch be moved to first in the order? Shouldn't reindexing
    (if enabled) complete before the other patches run?


repository/src/main/java/org/apache/atlas/repository/patches/ReIndexPatch.java
Lines 37 (patched)
<https://reviews.apache.org/r/73010/#comment311243>

    nit: remove the unused imports on lines 37 and 38.


repository/src/main/java/org/apache/atlas/repository/patches/ReIndexPatch.java
Lines 85 (patched)
<https://reviews.apache.org/r/73010/#comment311244>

    'vertexIndexNames' and 'edgeIndexNames' can be initialized here as static
    String arrays instead of being assigned in the constructor.


- Sarath Subramanian


On Nov. 9, 2020, 1:45 p.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/73010/
> -----------------------------------------------------------
>
> (Updated Nov. 9, 2020, 1:45 p.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Nikhil Bonte, Nixon Rodrigues,
> and Sarath Subramanian.
>
>
> Bugs: ATLAS-4015
>     https://issues.apache.org/jira/browse/ATLAS-4015
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Background**
> Please see JIRA.
> Until now, re-indexing in Atlas was implemented as an external tool. Using
> this tool had a number of challenges, the biggest being its throughput: for
> a medium-sized Atlas repository, the tool could take days to finish.
>
> This implementation addresses those problems. (See results below.)
>
> **Approach**
> Re-indexing is now implemented as a JAVA_PATCH that is applied only when the
> property _atlas.patch.reindex.enabled_ is set to true.
>
> *Modified* AtlasJanusGraphManagement: New method _reindex_ implements the
> re-indexing logic.
> *New* _ReIndexPatch_ is a JAVA_PATCH that implements the reindexing logic.
> This uses the PC framework to enumerate vertices and edges. The patch
> application displays useful log messages indicating progress.
>
> **Configuration**
> _atlas.patch.reindex.enabled=true_
>
>
> Diffs
> -----
>
>   graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasGraphManagement.java
> f7d2e273c
>   graphdb/janus/src/main/java/org/apache/atlas/repository/graphdb/janus/AtlasJanusGraphManagement.java
> 2a2ef92a7
>   intg/src/main/java/org/apache/atlas/AtlasConfiguration.java 1c7915859
>   repository/src/main/java/org/apache/atlas/repository/patches/AtlasPatchManager.java
> b142a2a4a
>   repository/src/main/java/org/apache/atlas/repository/patches/ConcurrentPatchProcessor.java
> c6f0e6438
>   repository/src/main/java/org/apache/atlas/repository/patches/ReIndexPatch.java
> PRE-CREATION
>
>
> Diff: https://reviews.apache.org/r/73010/diff/1/
>
>
> Testing
> -------
>
> **Test Setup**
> Start with a known Atlas setup with known data. Ascertain that basic search
> yields results.
>
> Use these curl commands to delete the Solr indexes:
>
> curl "http://<host>:8983/solr/vertex_index/update?commit=true" \
>   -H "Content-Type: text/xml" \
>   --data-binary '<delete><query>b2d_t:*</query></delete>'
>
> curl "http://<host>:8983/solr/edge_index/update?commit=true" \
>   -H "Content-Type: text/xml" \
>   --data-binary '<delete><query>1151_t:*</query></delete>'
>
> curl "http://ve0128.halxg.cloudera.com:8983/solr/fulltext_index/update?commit=true" \
>   -H "Content-Type: text/xml" \
>   --data-binary '<delete><query>14at_t:*</query></delete>'
>
> This will delete the Solr indexes. If a basic search is then performed from
> the web UI, it will not show any results.
>
> Now set the configuration parameter and restart Atlas.
>
> Server-side logs will indicate that the patch has run.
>
> **Volume Testing**
> Vertices: ~16M; duration: ~5 hrs.
> Edges: ~122M; duration: ~6 hrs.
>
> Configuration parameters:
> atlas.patch.reindex.enabled=true
> atlas.patch.numWorkers=14
> atlas.patch.batchSize=1000
>
> Node configuration:
> Atlas: Heap size: 6 GB.
> Solr: Heap size: 12 GB.
>
> **PC Build**
> https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/177/
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
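
To illustrate the comment on ReIndexPatch.java line 85, here is a minimal sketch of what static initialization might look like, together with the _atlas.patch.reindex.enabled_ gate from the description. It is not the code under review: the class name ReIndexPatchSketch, the accessor methods, and isEnabled are hypothetical, the index names are borrowed from the Solr collections in the test setup, and grouping fulltext_index with the vertex indexes is an assumption.

```java
import java.util.Properties;

// Hypothetical stand-in for ReIndexPatch; NOT the class under review.
final class ReIndexPatchSketch {
    // Property name taken from the Configuration section of the review request.
    static final String REINDEX_ENABLED = "atlas.patch.reindex.enabled";

    // Assumed index names, borrowed from the Solr collections in the test setup.
    // Grouping fulltext_index with the vertex indexes is an assumption.
    private static final String[] VERTEX_INDEX_NAMES = { "vertex_index", "fulltext_index" };
    private static final String[] EDGE_INDEX_NAMES   = { "edge_index" };

    private ReIndexPatchSketch() { }

    // Return copies so callers cannot modify the shared constants.
    static String[] vertexIndexNames() { return VERTEX_INDEX_NAMES.clone(); }
    static String[] edgeIndexNames()   { return EDGE_INDEX_NAMES.clone(); }

    // The patch applies only when the property is explicitly set to "true".
    static boolean isEnabled(Properties config) {
        return Boolean.parseBoolean(config.getProperty(REINDEX_ENABLED, "false"));
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty(REINDEX_ENABLED, "true");
        System.out.println("enabled=" + isEnabled(config));
        System.out.println("vertex indexes=" + String.join(",", vertexIndexNames()));
        System.out.println("edge indexes=" + String.join(",", edgeIndexNames()));
    }
}
```

With static arrays the names are fixed once at class-load time, so the constructor no longer needs to assign them; whether that fits the actual patch depends on how the real class obtains its index names.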