Thanks! This is great. I¹ll try it out!
On 12/19/07 3:01 PM, "Kieran Kelleher" <[EMAIL PROTECTED]> wrote:
> And BTW, these massive insert operations are usually done in a background
> thread (either autonomous or with LongResponse meta-refreshing page) and IIRC
> (without looking at my code but I usually have one or a pool of OSC's just for
> EOF intensive background threads) I am probably using a different
> ObjectStoreCoordinator to the regular OSC(s) being used for user R-R handling.
> Using the default OSC for a few 100 thousand may affect application
> responsiveness.
>
> On Dec 19, 2007, at 1:32 PM, Kieran Kelleher wrote:
>
>> Do them one at a time using adaptor operations in a loop ( I regularly run
>> some processes that insert 500,000 to 1,000,000 rows using this method and it
>> is pretty fast. I forget what the numbers were, but it is acceptable. The
>> code pattern was copied from ERX's updateRows and ChannelOperation, so the
>> credit goes to Anjo for the channel handling logic. (see below this snippet
>> for primary key handling.....). Anyway, copy these two methods to a static
>> class and try it. Test it with a millisecond timer before you discount it as
>> too slow. For me it is plenty fast enough ...... I forgot how much faster
>> than regular editing context inserting and saving .... maybe 200 times faster
>> or something ... I forget.......... also at the bottom of the email I have
>> pasted in a method from a piece of code that uses these two static utility
>> methods just to give you an idea of usage .... this is code from one of my
>> apps, but it will give you an idea of how to do what you want ........ I
>> colored the snippet where the work happens in red (not sure if mailing list
>> will carry that or not)
>>
>> HTH, Kieran
>>
>> /**
>> * Inserts a raw row. You are responsible to ensure it has a new valid
>> primary key
>> * @see EOAdaptorChannel#insertRow(NSDictionary, EOEntity)
>> * @param ec
>> * @param rawRow
>> * @param entityName
>> * @return 1
>> */
>> public static int insertRawRowForEntity(EOEditingContext ec, NSDictionary
>> rawRow, String entityName){
>> EOEntity entity = entityNamed(entityName);
>> String modelName = entity.model().name();
>>
>> boolean wasOpen = true;
>> EOAdaptorChannel channel = null;
>> int rows = 0;
>> ec.lock();
>> try {
>> EODatabaseContext dbc =
>> EOUtilities.databaseContextForModelNamed(ec, modelName);
>> dbc.lock();
>> try {
>> channel = dbc.availableChannel().adaptorChannel();
>> wasOpen = channel.isOpen();
>> if(!wasOpen) {
>> channel.openChannel();
>> }
>> channel.adaptorContext().beginTransaction();
>> try {
>> channel.insertRow(rawRow, entity);
>> channel.adaptorContext().commitTransaction();
>> } catch(RuntimeException ex) {
>> channel.adaptorContext().rollbackTransaction();
>> throw ex;
>> }
>> } finally {
>> if(!wasOpen) {
>> channel.closeChannel();
>> }
>> dbc.unlock();
>> }
>> } finally {
>> ec.unlock();
>> }
>> // TODO : The JDBCChannel concrete subclass has a rowsProcessedCount,
>> so perhaps use that
>> return 1;
>> }
>>
>>
>>
>> Before starting the batch insert, I grab an array of primary keys based on
>> how many I need and just pull them form the array as I loop thru inserting.
>>
>> /**
>> * @param ec
>> * @param entityName
>> * @return
>> * @see
>> com.webobjects.eoaccess.EOAdaptorChannel#primaryKeysForNewRowsWithEntity(int,
>> EOEntity)
>> */
>> public static NSArray primaryKeysForNewRowsWithEntity(EOEditingContext
>> ec, int count, String entityName) {
>> EOEntity entity = entityNamed(entityName);
>> String modelName = entity.model().name();
>> NSArray primaryKeysArray = null;
>>
>> boolean wasOpen = true;
>> EOAdaptorChannel channel = null;
>> int rows = 0;
>> ec.lock();
>> try {
>> EODatabaseContext dbc =
>> EOUtilities.databaseContextForModelNamed(ec, modelName);
>> dbc.lock();
>> try {
>> channel = dbc.availableChannel().adaptorChannel();
>> wasOpen = channel.isOpen();
>> if(!wasOpen) {
>> channel.openChannel();
>> }
>> channel.adaptorContext().beginTransaction();
>> try {
>> primaryKeysArray =
>> channel.primaryKeysForNewRowsWithEntity(count, entity);
>> channel.adaptorContext().commitTransaction();
>> } catch(RuntimeException ex) {
>> channel.adaptorContext().rollbackTransaction();
>> throw ex;
>> }
>> } finally {
>> if(!wasOpen) {
>> channel.closeChannel();
>> }
>> dbc.unlock();
>> }
>> } finally {
>> ec.unlock();
>> }
>> return primaryKeysArray;
>> }
>>
>>
>>
>> ****** EXAMPLE OF USAGE (lifted from an existing app) **************
>>
>> /**
>> * OK, the EO direction was way too slow. Here is the faster more "bare
>> metal" approach.
>> * First simply deleteRowsDescribedByQualifier.
>> * Then fetch all geoPoints.
>> * Then insertRows
>> * All done with creating EO's only because we need the speed performance
>> * due to having 10's of thousands of rows per each geospec
>> * TODO: Logic to update a radius by comparing to geoPoints and setting
>> all isOverlapExclude to false
>> */
>> public void resetGeoPointsInRadius(){
>> if (isNewObject()) {
>> throw new RuntimeException("We cannot perform raw row operations
>> on related geospatial points for a CTGeographySpec that is not yet saved"
>> + ". The owner object in question is " + ownerObject());
>>
>> } //~ if (isNewObject())
>>
>> StopWatch stopWatch = new StopWatch();
>> stopWatch.start();
>>
>> // First delete all related geopoints
>> int rowsAffected =
>> ERXEOAccessUtilities.deleteRowsDescribedByQualifier(editingContext(),
>> CTGeoSpatialPoint.ENTITY_NAME, geoSpatialPointsQualifier());
>> if (log.isDebugEnabled())
>> log.debug("DELETE rowsAffected = " + rowsAffected);
>> stopWatch.split();
>> if (log.isDebugEnabled())
>> log.debug("split time after deleting old geospatial points = " +
>> stopWatch);
>> stopWatch.unsplit();
>>
>> // Sometimes we cannot get a geocoded centroid, so check this first
>> if (centroidGeoPoint() != null) {
>> // Recreate all related items in radius
>> NSArray geoPointRawRows =
>> GeoPoint.Utilities.pointsInRadius(centroidGeoPoint(), radiusMiles(),
>> geoPointTypeEntityName());
>>
>> // For speed requirements, we insert geospatial points directly
>> as raw rows
>> Object oidGeographySpec = rawPrimaryKey();
>> Number geoPointType = geoPointType();
>> Integer isOverLapExcludeDefault = new Integer(0);
>> // grab primary keys for the new rows
>> NSArray geoSpatialPointPrimaryKeys =
>> WKEOUtils.primaryKeysForNewRowsWithEntity(editingContext(),
>> geoPointRawRows.count(), CTGeoSpatialPoint.ENTITY_NAME);
>>
>> for (int i = 0; i < geoPointRawRows.count(); i++) {
>> NSDictionary currentGeoPointRawRow = (NSDictionary)
>> geoPointRawRows.objectAtIndex(i);
>> NSMutableDictionary geoSpatialRawRow = new
>> NSMutableDictionary();
>> geoSpatialRawRow.takeValueForKey(oidGeographySpec,
>> CTGeoSpatialPoint.KEY_OID_GEOGRAPHY_SPEC);
>>
>> geoSpatialRawRow.takeValueForKey(currentGeoPointRawRow.valueForKey("oid"),
>> CTGeoSpatialPoint.KEY_OID_GEO_POINT);
>> geoSpatialRawRow.takeValueForKey(geoPointType,
>> CTGeoSpatialPoint.KEY_GEO_POINT_TYPE);
>>
>> geoSpatialRawRow.takeValueForKey(currentGeoPointRawRow.valueForKey("distance"
>> ), CTGeoSpatialPoint.KEY_DISTANCE);
>> geoSpatialRawRow.takeValueForKey(isOverLapExcludeDefault,
>> CTGeoSpatialPoint.KEY_IS_OVERLAP_EXCLUDE);
>>
>> NSDictionary pkDict = (NSDictionary)
>> geoSpatialPointPrimaryKeys.objectAtIndex(i);
>> // Merge the pk dict
>> geoSpatialRawRow.addEntriesFromDictionary(pkDict);
>>
>> if (log.isDebugEnabled() && i < 10) {
>> log.debug("geoSpatialRawRow = " + geoSpatialRawRow);
>> }
>> WKEOUtils.insertRawRowForEntity(editingContext(),
>> geoSpatialRawRow, CTGeoSpatialPoint.ENTITY_NAME);
>> }
>>
>> stopWatch.stop();
>> if (log.isDebugEnabled())
>> log.debug("Time to reset program geospec for " + program() +
>> " = " + stopWatch);
>> } else {
>> if (log.isDebugEnabled())
>> log.debug("Could not get centroidGeoPoint for " + this);
>> } //~ if (centroidGeoPoint() != null)
>>
>> }
>>
>>
>>
>>
>> On Dec 19, 2007, at 12:45 PM, Charles Koppelman wrote:
>>
>>> So ERX provides us with a great way to do massive updates
>>> (ERXEOAccessUtilities.updateRowsDescribedByQualifier). Is there a way that
>>> anyone on this list has come up with to do massive INSERTs?
>>>
>>> Looking at the code, it seems that inserts are done exclusively through
>>> insertRow methods, meaning one insert statement per row. However, SQL
>>> supports multiple inserts into the same table like:
>>>
>>> INSERT INTO MY_TABLE
>>> (ID, COLUMN_A, COLUMN_B)
>>> VALUES (1, 'A', 'B'),
>>> (2, 'C', 'D'),
>>> (3, 'E', 'F');
>>>
>>> Has anyone implemented this?
>>>
>>> Charles Koppelman
>>>
>>> Programmer
>>> Youth For Understanding, USA
>>>
>>> _______________________________________________
>>> Do not post admin requests to the list. They will be ignored.
>>> Webobjects-dev mailing list ([email protected])
>>> Help/Unsubscribe/Update your Subscription:
>>> http://lists.apple.com/mailman/options/webobjects-dev/kieran_lists%40mac.com
>>>
>>> This email sent to [EMAIL PROTECTED]
>>>
>>
>
>
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list ([email protected])
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com
This email sent to [EMAIL PROTECTED]