Re: [PHP] Large XML manipulation within PHP

2008-04-24 Thread Per Jessen
Steve Gula wrote:

> I work for a company that has chosen to use XML (Software AG Tamino
> XML database) as its storage system for an enterprise application. We
> need to make a system wide change to information within the database
> that isn't feasible to do through our application's user interface. My
> solution was to unload the XML collection in question, open it,
> manipulate it, then write it back out. Problem is it's a 230+MB file
> and even with PHP's max mem set to 4096MB (of 8GB available to the
> system) SimpleXML claims to still run out of memory. Can anyone
> recommend a better way for handling a large amount of XML data?

xalan.


/Per Jessen, Zürich


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Large XML manipulation within PHP

2008-04-23 Thread Bojan Tesanovic
In that case you may want to try XMLReader, as it doesn't load the whole
XML document into memory.


If that doesn't help, then you will need to write a custom parser
application for your needs: use XMLReader to read through the whole XML
file, chunking it at e.g. every 5000 items and storing those chunks on disk.


Then use SimpleXML to read and manipulate those chunks and save them
back to disk.


It would help if you could provide an XML mockup,
e.g.

 
  ...


  ...


  ...






<?php
// Split the large file into chunk files of SPLITAT items each (see below).
makeChunksWithXmlReader($pathToLargeXmlFile, CustomXmlManipulator::$SPLITAT);

class CustomXmlManipulator {
    // Number of items stored in each chunk file.
    static $SPLITAT = 5000;

    // Load the chunk file that contains the item with the given ID.
    function getXmlChunk($id) {
        return simplexml_load_file($this->getXmlFile($id));
    }

    // Write a modified chunk back to the file it came from.
    function storeXml($id, $simpleXmlObject) {
        $file = $this->getXmlFile($id);
        file_put_contents($file, $simpleXmlObject->asXML());
        // This drops only the local reference; the caller must unset its own.
        $simpleXmlObject = null;
    }

    // Map an item ID to the name of its chunk file.
    function getXmlFile($id) {
        $chunk = (int)($id / self::$SPLITAT) + 1;
        return 'xml-' . $chunk . '.xml';
    }
}

$XMLM = new CustomXmlManipulator();
$first = $XMLM->getXmlChunk(1);

foreach ($first as $x) {
    ...
    if (something) {
        // here you need to manipulate ID 23493
        $tmpX = $XMLM->getXmlChunk(23493);
        $tmpX->... = ...;  // change XML
        $XMLM->storeXml(23493, $tmpX);
    }
}

?>


This is just the basic logic; it can be extended further, depending
on your needs.
The function makeChunksWithXmlReader needs to go through the XML file
and write the chunks to disk.

More on XMLReader: http://www.php.net/manual/en/class.xmlreader.php
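
For illustration only, here is a rough sketch of what makeChunksWithXmlReader
could look like. Nothing in it comes from the real Tamino export: the record
element name 'item' and the wrapping <chunk> root element are assumptions, and
the chunk numbering lines up with getXmlFile() only if the IDs are zero-based
sequential positions in the file.

```php
<?php
// Hypothetical helper: streams $path with XMLReader and writes every
// $splitAt records to xml-1.xml, xml-2.xml, ... in the current directory.
// ASSUMPTIONS: records are elements named $itemName ('item' is a made-up
// default), and each chunk is wrapped in a made-up <chunk> root element.
function makeChunksWithXmlReader($path, $splitAt, $itemName = 'item')
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }

    // Move the cursor to the first record element.
    $found = false;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $itemName) {
            $found = true;
            break;
        }
    }

    $count = 0;   // records written so far
    $chunk = 0;   // number of the chunk file currently open
    $out = null;
    while ($found) {
        if ($count % $splitAt === 0) {
            // Close the previous chunk file and start the next one.
            if ($out !== null) {
                fwrite($out, "</chunk>\n");
                fclose($out);
            }
            $chunk++;
            $out = fopen('xml-' . $chunk . '.xml', 'wb');
            fwrite($out, "<chunk>\n");
        }
        // Only the current record is expanded in memory.
        fwrite($out, $reader->readOuterXml() . "\n");
        $count++;
        // Skip this record's subtree and jump to the next record.
        $found = $reader->next($itemName);
    }

    if ($out !== null) {
        fwrite($out, "</chunk>\n");
        fclose($out);
    }
    $reader->close();
    return $chunk;   // number of chunk files written
}
```

Each chunk file can then be opened with simplexml_load_file() and iterated
with foreach, since the records are direct children of the chunk root.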





On Apr 23, 2008, at 10:41 PM, Steve Gula wrote:

> I could but it would make things very difficult. Some of the entities around
> id # 100 could be affected by entities around id #11000 and would result in
> a file needing to be manipulated at the same time. Unfortunately, I don't
> think this is a top to bottom change for the information at hand.



Bojan Tesanovic
http://www.carster.us/






Re: [PHP] Large XML manipulation within PHP

2008-04-23 Thread @4u

Hi,

How about expat with custom XML handlers? It should work even with a 32 MB
memory limit. It will just take some time ...
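
A minimal sketch of that idea, using PHP's expat-based xml_* functions. The
function name countElements and the 8 KB buffer size are made up for the
example, and the handler only counts element names; a real job would do its
work (and write its output) inside the handlers instead.

```php
<?php
// Hypothetical example: stream a file through expat in small buffers and
// count how often each element name occurs. Memory use stays roughly at
// the buffer size plus the $counts array, whatever the file size.
function countElements($path, $bufferSize = 8192)
{
    $counts = array();
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        // Start-tag handler: called once per opening tag.
        function ($parser, $name, $attrs) use (&$counts) {
            $counts[$name] = isset($counts[$name]) ? $counts[$name] + 1 : 1;
        },
        // End-tag handler: nothing to do in this sketch.
        function ($parser, $name) {
        }
    );

    $fh = fopen($path, 'rb');
    if ($fh === false) {
        throw new RuntimeException("Cannot open $path");
    }
    while (!feof($fh)) {
        $data = fread($fh, $bufferSize);
        // The third argument tells expat whether this is the last piece.
        if (!xml_parse($parser, $data, feof($fh))) {
            $error = xml_error_string(xml_get_error_code($parser));
            fclose($fh);
            throw new RuntimeException("XML error: $error");
        }
    }
    fclose($fh);
    return $counts;
}
```

Because the handlers see one tag at a time, anything that needs data from
elsewhere in the file has to be collected in a first pass and applied in a
second one.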


Have fun

Bastien Koert wrote:

> Can you chunk the data in any way, break it into smaller, more manageable
> pieces?






Re: [PHP] Large XML manipulation within PHP

2008-04-23 Thread Stut

On 23 Apr 2008, at 21:41, Steve Gula wrote:

> I could but it would make things very difficult. Some of the entities around
> id # 100 could be affected by entities around id #11000 and would result in
> a file needing to be manipulated at the same time. Unfortunately, I don't
> think this is a top to bottom change for the information at hand.


Can you not do it with a text processor like sed? That would be a lot  
easier than trying to do it with SimpleXML.
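
If sed itself isn't an option, the same kind of line-by-line substitution can
be sketched in PHP. The function name streamReplace and its parameters are
invented for this example, and it assumes the unloaded file has line breaks
and that no value to be changed spans more than one line. If the export is one
huge line, fgets() would pull all of it into memory.

```php
<?php
// Hypothetical sed-style filter: copies $inPath to $outPath one line at a
// time, applying a regular-expression substitution to each line.
// Returns the total number of replacements made.
function streamReplace($inPath, $outPath, $pattern, $replacement)
{
    $in = fopen($inPath, 'rb');
    $out = fopen($outPath, 'wb');
    if ($in === false || $out === false) {
        throw new RuntimeException('Cannot open input or output file');
    }

    $changed = 0;
    while (($line = fgets($in)) !== false) {
        // $n receives the number of matches replaced on this line.
        $line = preg_replace($pattern, $replacement, $line, -1, $n);
        $changed += $n;
        fwrite($out, $line);
    }

    fclose($in);
    fclose($out);
    return $changed;
}
```

This treats the XML as plain text, so it only suits changes that can be
matched safely by a pattern without knowing the document structure.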


-Stut

--
http://stut.net/




Re: [PHP] Large XML manipulation within PHP

2008-04-23 Thread Steve Gula
I could but it would make things very difficult. Some of the entities around
id # 100 could be affected by entities around id #11000 and would result in
a file needing to be manipulated at the same time. Unfortunately, I don't
think this is a top to bottom change for the information at hand.

On Wed, Apr 23, 2008 at 4:36 PM, Bastien Koert <[EMAIL PROTECTED]> wrote:

>
>
> On 4/23/08, Steve Gula <[EMAIL PROTECTED]> wrote:
> >
> > I work for a company that has chosen to use XML (Software AG Tamino XML
> > database) as its storage system for an enterprise application. We need
> > to
> > make a system wide change to information within the database that isn't
> > feasible to do through our application's user interface. My solution was
> > to
> > unload the XML collection in question, open it, manipulate it, then
> > write it
> > back out. Problem is it's a 230+MB file and even with PHP's max mem set
> > to
> > 4096MB (of 8GB available to the system) SimpleXML claims to still run
> > out of
> > memory. Can anyone recommend a better way for handling a large amount of
> > XML
> > data? Thanks.
> >
> > --
> > --Steve Gula
> >
> > (this email address is used for list communications only, direct contact
> > at
> > this email address is not guaranteed to be read)
> >
>
> Can you chunk the data in any way, break it into smaller, more manageable
> pieces?
>
> --
>
> Bastien
>
> Cat, the other other white meat




-- 
--Steve Gula

(this email address is used for list communications only, direct contact at
this email address is not guaranteed to be read)


Re: [PHP] Large XML manipulation within PHP

2008-04-23 Thread Bastien Koert
On 4/23/08, Steve Gula <[EMAIL PROTECTED]> wrote:
>
> I work for a company that has chosen to use XML (Software AG Tamino XML
> database) as its storage system for an enterprise application. We need to
> make a system wide change to information within the database that isn't
> feasible to do through our application's user interface. My solution was
> to
> unload the XML collection in question, open it, manipulate it, then write
> it
> back out. Problem is it's a 230+MB file and even with PHP's max mem set to
> 4096MB (of 8GB available to the system) SimpleXML claims to still run out
> of
> memory. Can anyone recommend a better way for handling a large amount of
> XML
> data? Thanks.
>
> --
> --Steve Gula
>
> (this email address is used for list communications only, direct contact
> at
> this email address is not guaranteed to be read)
>

Can you chunk the data in any way, break it into smaller, more manageable
pieces?

-- 

Bastien

Cat, the other other white meat


[PHP] Large XML manipulation within PHP

2008-04-23 Thread Steve Gula
I work for a company that has chosen to use XML (Software AG Tamino XML
database) as its storage system for an enterprise application. We need to
make a system wide change to information within the database that isn't
feasible to do through our application's user interface. My solution was to
unload the XML collection in question, open it, manipulate it, then write it
back out. Problem is it's a 230+MB file and even with PHP's max mem set to
4096MB (of 8GB available to the system) SimpleXML claims to still run out of
memory. Can anyone recommend a better way for handling a large amount of XML
data? Thanks.

-- 
--Steve Gula

(this email address is used for list communications only, direct contact at
this email address is not guaranteed to be read)