Your timing should match the one in the log; SVDSolve logs the time of 
all relevant operations. I suggest doing a step-by-step execution in a debugger 
to see where those 1000 seconds are spent.
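
For reference, a minimal way to bracket the call with PetscTime() and compare 
against the -log_view figure (a sketch, assuming the SVD object svd is already 
set up; error checking omitted):

  #include <petsctime.h>

  PetscLogDouble t0, t1;
  PetscTime(&t0);
  SVDSolve(svd);      /* the event reported as SVDSolve in -log_view */
  PetscTime(&t1);
  PetscPrintf(PETSC_COMM_WORLD, "SVDSolve wall time: %g s\n", (double)(t1 - t0));
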
Jose


> On 17 Nov 2020, at 09:05, Rakesh Halder <[email protected]> wrote:
> 
> When building the matrix, I use SVDGetSingularTriplet to get the left 
> singular vector for each singular value I want, and VecGetArray to get 
> its data and insert the values into a preallocated matrix I created to 
> store the results. I'm wondering if this is the best approach.
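> 
> Roughly, the loop looks like this (a sketch; U is my preallocated dense 
> result matrix and A is the operator matrix, both placeholders from my 
> code, and error checking is omitted):
> 
>   Vec                u;
>   PetscInt           nconv, i, j, nloc, lda;
>   PetscReal          sigma;
>   const PetscScalar *ua;
>   PetscScalar       *Uarr;
> 
>   SVDGetConverged(svd, &nconv);    /* number of converged triplets */
>   MatCreateVecs(A, NULL, &u);      /* left vector, matches A's row layout */
>   VecGetLocalSize(u, &nloc);
>   MatDenseGetLDA(U, &lda);         /* leading dimension of U's storage */
>   MatDenseGetArray(U, &Uarr);
>   for (i = 0; i < nconv; i++) {
>     SVDGetSingularTriplet(svd, i, &sigma, u, NULL);
>     VecGetArrayRead(u, &ua);       /* read-only access to the vector data */
>     for (j = 0; j < nloc; j++) Uarr[i*lda + j] = ua[j];  /* column i, column-major */
>     VecRestoreArrayRead(u, &ua);
>   }
>   MatDenseRestoreArray(U, &Uarr);
>   VecDestroy(&u);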
> 
> I also mentioned earlier that in my code I measured the time before and 
> after calling SVDSolve and found that the elapsed time was around 1000 
> seconds, even though the log reported 75 seconds. Could there be an 
> issue with creating some of the internal data structures within the SVD 
> object?
> 
> On Tue, Nov 17, 2020 at 2:43 AM Jose E. Roman <[email protected]> wrote:
> What I meant was to send the output of -log_view without any XML formatting. 
> Anyway, as you said, the call to the SVD solver takes 75 seconds. The rest of 
> the time should be attributed to your code, I guess, or maybe to a lack of 
> preallocation if you are building the matrix in AIJ format.
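> 
> For a dense snapshot matrix it is usually better to create it as MATDENSE 
> directly, something like this (a sketch, assuming global dimensions N and n; 
> error checking omitted):
> 
>   Mat A;
>   MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, N, n, NULL, &A);
>   /* fill entries, e.g. via MatDenseGetArray() or MatSetValues(), then: */
>   MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
>   MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);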
> 
> Jose
> 
> 
> > On 17 Nov 2020, at 08:31, Rakesh Halder <[email protected]> wrote:
> > 
> > And this output is from the small matrix log: 
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <?xml-stylesheet type="text/xsl" href="performance_xml2html.xsl"?>
> > <root>
> > <!-- PETSc Performance Summary: -->
> >   <petscroot>
> >     <runspecification desc="Run Specification">
> >       <executable desc="Executable">simpleROMFoam</executable>
> >       <architecture desc="Architecture">real-opt</architecture>
> >       <hostname desc="Host">pmultigrid</hostname>
> >       <nprocesses desc="Number of processes">1</nprocesses>
> >       <user desc="Run by user">rhalder</user>
> >       <date desc="Started at">Mon Nov 16 20:40:01 2020</date>
> >       <petscrelease desc="Petsc Release">Petsc Release Version 3.14.1, Nov 
> > 03, 2020 </petscrelease>
> >     </runspecification>
> >     <globalperformance desc="Global performance">
> >       <time desc="Time (sec)">
> >         <max>2.030551e+02</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>1.000000</ratio>
> >         <average>2.030551e+02</average>
> >       </time>
> >       <objects desc="Objects">
> >         <max>5.300000e+01</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>1.000000</ratio>
> >         <average>5.300000e+01</average>
> >       </objects>
> >       <mflop desc="MFlop">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </mflop>
> >       <mflops desc="MFlop/sec">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </mflops>
> >       <messagetransfers desc="MPI Message Transfers">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </messagetransfers>
> >       <messagevolume desc="MPI Message Volume (MiB)">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </messagevolume>
> >       <reductions desc="MPI Reductions">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >       </reductions>
> >     </globalperformance>
> >     <timertree desc="Timings tree">
> >       <totaltime>203.055134</totaltime>
> >       <timethreshold>0.010000</timethreshold>
> >       <event>
> >         <name>MatConvert</name>
> >         <time>
> >           <value>0.0297699</value>
> >         </time>
> >         <events>
> >           <event>
> >             <name>self</name>
> >             <time>
> >               <value>0.029759</value>
> >             </time>
> >           </event>
> >         </events>
> >       </event>
> >       <event>
> >         <name>SVDSolve</name>
> >         <time>
> >           <value>0.0242731</value>
> >         </time>
> >         <events>
> >           <event>
> >             <name>self</name>
> >             <time>
> >               <value>0.0181869</value>
> >             </time>
> >           </event>
> >         </events>
> >       </event>
> >       <event>
> >         <name>MatView</name>
> >         <time>
> >           <value>0.0138235</value>
> >         </time>
> >       </event>
> >     </timertree>
> >     <selftimertable desc="Self-timings">
> >       <totaltime>203.055134</totaltime>
> >       <event>
> >         <name>MatConvert</name>
> >         <time>
> >           <value>0.0324545</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>SVDSolve</name>
> >         <time>
> >           <value>0.0181869</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>MatView</name>
> >         <time>
> >           <value>0.0138235</value>
> >         </time>
> >       </event>
> >     </selftimertable>
> >   </petscroot>
> > </root>
> > 
> > 
> > On Tue, Nov 17, 2020 at 2:30 AM Rakesh Halder <[email protected]> wrote:
> > The following is from the large matrix log: 
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <?xml-stylesheet type="text/xsl" href="performance_xml2html.xsl"?>
> > <root>
> > <!-- PETSc Performance Summary: -->
> >   <petscroot>
> >     <runspecification desc="Run Specification">
> >       <executable desc="Executable">simpleROMFoam</executable>
> >       <architecture desc="Architecture">real-opt</architecture>
> >       <hostname desc="Host">pmultigrid</hostname>
> >       <nprocesses desc="Number of processes">1</nprocesses>
> >       <user desc="Run by user">rhalder</user>
> >       <date desc="Started at">Mon Nov 16 20:25:52 2020</date>
> >       <petscrelease desc="Petsc Release">Petsc Release Version 3.14.1, Nov 
> > 03, 2020 </petscrelease>
> >     </runspecification>
> >     <globalperformance desc="Global performance">
> >       <time desc="Time (sec)">
> >         <max>1.299397e+03</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>1.000000</ratio>
> >         <average>1.299397e+03</average>
> >       </time>
> >       <objects desc="Objects">
> >         <max>9.100000e+01</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>1.000000</ratio>
> >         <average>9.100000e+01</average>
> >       </objects>
> >       <mflop desc="MFlop">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </mflop>
> >       <mflops desc="MFlop/sec">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </mflops>
> >       <messagetransfers desc="MPI Message Transfers">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </messagetransfers>
> >       <messagevolume desc="MPI Message Volume (MiB)">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >         <average>0.000000e+00</average>
> >         <total>0.000000e+00</total>
> >       </messagevolume>
> >       <reductions desc="MPI Reductions">
> >         <max>0.000000e+00</max>
> >         <maxrank desc="rank at which max was found">0</maxrank>
> >         <ratio>0.000000</ratio>
> >       </reductions>
> >     </globalperformance>
> >     <timertree desc="Timings tree">
> >       <totaltime>1299.397478</totaltime>
> >       <timethreshold>0.010000</timethreshold>
> >       <event>
> >         <name>SVDSolve</name>
> >         <time>
> >           <value>75.5819</value>
> >         </time>
> >         <events>
> >           <event>
> >             <name>self</name>
> >             <time>
> >               <value>75.3134</value>
> >             </time>
> >           </event>
> >           <event>
> >             <name>MatConvert</name>
> >             <time>
> >               <value>0.165386</value>
> >             </time>
> >             <ncalls>
> >               <value>3.</value>
> >             </ncalls>
> >             <events>
> >               <event>
> >                 <name>self</name>
> >                 <time>
> >                   <value>0.165386</value>
> >                 </time>
> >               </event>
> >             </events>
> >           </event>
> >           <event>
> >             <name>SVDSetUp</name>
> >             <time>
> >               <value>0.102518</value>
> >             </time>
> >             <events>
> >               <event>
> >                 <name>self</name>
> >                 <time>
> >                   <value>0.0601394</value>
> >                 </time>
> >               </event>
> >               <event>
> >                 <name>VecSet</name>
> >                 <time>
> >                   <value>0.0423783</value>
> >                 </time>
> >                 <ncalls>
> >                   <value>4.</value>
> >                 </ncalls>
> >               </event>
> >             </events>
> >           </event>
> >         </events>
> >       </event>
> >       <event>
> >         <name>MatConvert</name>
> >         <time>
> >           <value>0.575872</value>
> >         </time>
> >         <events>
> >           <event>
> >             <name>self</name>
> >             <time>
> >               <value>0.575869</value>
> >             </time>
> >           </event>
> >         </events>
> >       </event>
> >       <event>
> >         <name>MatView</name>
> >         <time>
> >           <value>0.424561</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>BVCopy</name>
> >         <time>
> >           <value>0.0288127</value>
> >         </time>
> >         <ncalls>
> >           <value>2000.</value>
> >         </ncalls>
> >         <events>
> >           <event>
> >             <name>VecCopy</name>
> >             <time>
> >               <value>0.0284472</value>
> >             </time>
> >           </event>
> >         </events>
> >       </event>
> >       <event>
> >         <name>MatAssemblyEnd</name>
> >         <time>
> >           <value>0.0128941</value>
> >         </time>
> >       </event>
> >     </timertree>
> >     <selftimertable desc="Self-timings">
> >       <totaltime>1299.397478</totaltime>
> >       <event>
> >         <name>SVDSolve</name>
> >         <time>
> >           <value>75.3134</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>MatConvert</name>
> >         <time>
> >           <value>0.741256</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>MatView</name>
> >         <time>
> >           <value>0.424561</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>SVDSetUp</name>
> >         <time>
> >           <value>0.0601394</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>VecSet</name>
> >         <time>
> >           <value>0.0424012</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>VecCopy</name>
> >         <time>
> >           <value>0.0284472</value>
> >         </time>
> >       </event>
> >       <event>
> >         <name>MatAssemblyEnd</name>
> >         <time>
> >           <value>0.0128944</value>
> >         </time>
> >       </event>
> >     </selftimertable>
> >   </petscroot>
> > </root>
> > 
> > 
> > On Tue, Nov 17, 2020 at 2:28 AM Jose E. Roman <[email protected]> wrote:
> > I cannot visualize the XML files. Please send the information in plain text.
> > Jose
> > 
> > 
> > > On 17 Nov 2020, at 05:33, Rakesh Halder <[email protected]> wrote:
> > > 
> > > Hi Jose,
> > > 
> > > I attached two XML logs from two different SVD calculations where N ~= 
> > > 140,000: first a small N x 5 matrix, and then a large N x 1000 matrix. 
> > > The global timing starts before the SVD calculations. The small matrix 
> > > calculation is very quick in total (less than a second), while the 
> > > larger one takes around 1,000 seconds. The "largeMat.xml" file shows that 
> > > SVDSolve takes around 75 seconds, but when I time it myself by outputting 
> > > the time difference to the console, it takes around 1,000 seconds, and 
> > > I'm not sure where this mismatch is coming from.
> > > 
> > > This is using the ScaLAPACK SVD solver on a single processor, and I call 
> > > MatConvert to convert my matrix to the MATSCALAPACK format.
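> > > 
> > > The conversion is roughly (a sketch of the call I use; error checking 
> > > omitted):
> > > 
> > >   Mat Asc;
> > >   MatConvert(A, MATSCALAPACK, MAT_INITIAL_MATRIX, &Asc);
> > >   SVDSetOperator(svd, Asc);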
> > > 
> > > Thanks,
> > > 
> > > Rakesh
> > > 
> > > On Mon, Nov 16, 2020 at 2:45 AM Jose E. Roman <[email protected]> wrote:
> > > For Cross and TRLanczos, make sure that the matrix is stored in DENSE 
> > > format, not in the default AIJ format. Also, these solvers build the 
> > > transpose matrix explicitly, which is bad for dense matrices in 
> > > parallel. Try using SVDSetImplicitTranspose(); this will also save memory.
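> > > 
> > > For example, something like (a sketch; error checking omitted):
> > > 
> > >   MatConvert(A, MATDENSE, MAT_INPLACE_MATRIX, &A);  /* store as dense */
> > >   SVDSetImplicitTranspose(svd, PETSC_TRUE);  /* avoid building A^T explicitly */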
> > > 
> > > For ScaLAPACK, it is better if the matrix is passed in the MATSCALAPACK 
> > > format already; otherwise the solver must convert it internally. Still, 
> > > the matrix of singular vectors must be converted after the computation.
> > > 
> > > In any case, performance questions should include information from 
> > > -log_view so that we have a better idea of what is going on.
> > > 
> > > Jose
> > > 
> > > 
> > > > On 16 Nov 2020, at 06:04, Rakesh Halder <[email protected]> wrote:
> > > > 
> > > > Hi Jose,
> > > > 
> > > > I'm only interested in part of the singular triplets, so those 
> > > > algorithms work for me. I tried using ScaLAPACK and it gives similar 
> > > > performance to Lanczos and Cross, so it's still very slow. I'm still 
> > > > having memory issues with LAPACK, and Elemental gives me an error 
> > > > message indicating that the operation isn't supported for rectangular 
> > > > matrices. 
> > > > 
> > > > With regard to ScaLAPACK or any other solver, I'm wondering if there 
> > > > are settings to use with the SVD object to ensure optimal performance.
> > > > 
> > > > Thanks,
> > > > 
> > > > Rakesh
> > > > 
> > > > On Sun, Nov 15, 2020 at 2:59 PM Jose E. Roman <[email protected]> 
> > > > wrote:
> > > > Rakesh,
> > > > 
> > > > The solvers you mention are not intended for computing the full SVD, 
> > > > only part of the singular triplets. In the latest version (3.14) there 
> > > > are now solvers that wrap external packages for parallel dense 
> > > > computations: ScaLAPACK and Elemental.
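> > > > 
> > > > They can be selected with -svd_type scalapack (or -svd_type elemental) 
> > > > at the command line, or programmatically, e.g. (a sketch):
> > > > 
> > > >   SVDSetType(svd, SVDSCALAPACK);   /* or SVDELEMENTAL */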
> > > > 
> > > > Jose
> > > > 
> > > > 
> > > > > On 15 Nov 2020, at 20:48, Matthew Knepley <[email protected]> wrote:
> > > > > 
> > > > > On Sun, Nov 15, 2020 at 2:18 PM Rakesh Halder <[email protected]> 
> > > > > wrote:
> > > > > Hi all,
> > > > > 
> > > > > A program I'm writing involves calculating the SVD of a large, dense 
> > > > > N by n matrix (N ~= 150,000, n ~= 10,000). I've used the different SVD 
> > > > > solvers available through SLEPc, including the cross product, 
> > > > > Lanczos, and the method available through the LAPACK library. The cross 
> > > > > product and Lanczos methods take a very long time to compute the SVD 
> > > > > (around 7-8 hours on one processor), while the solver using the LAPACK 
> > > > > library runs out of memory. If I write this matrix to a file and 
> > > > > solve the SVD using MATLAB or Python (NumPy), it takes around 10 
> > > > > minutes. I'm wondering if there's a much cheaper way to solve the SVD.
> > > > > 
> > > > > This seems suspicious, since I know NumPy just calls LAPACK, and I am 
> > > > > fairly sure that MATLAB does as well. Do the machines that you 
> > > > > are running on have different amounts of RAM?
> > > > > 
> > > > >   Thanks,
> > > > > 
> > > > >      Matt
> > > > >  
> > > > > Thanks,
> > > > > 
> > > > > Rakesh
> > > > > 
> > > > > 
> > > > > -- 
> > > > > What most experimenters take for granted before they begin their 
> > > > > experiments is infinitely more interesting than any results to which 
> > > > > their experiments lead.
> > > > > -- Norbert Wiener
> > > > > 
> > > > > https://www.cse.buffalo.edu/~knepley/
> > > > 
> > > 
> > > <largeMat.xml><smallMat.xml>
> > 
> 
