Wednesday, January 20, 2010

Persistence in Java

While reading Java Persistence with Hibernate, I am compiling a few points on what object persistence and object/relational mapping are, and on how Hibernate and the JPA specification in EJB 3.0 fit into the picture.

·      In an object-oriented application, persistence allows an object to outlive the process that created it.
·        Persistence of data can be thought of in terms of
o       Storage, organization, and retrieval of structured data
o       Concurrency and data integrity
o       Data sharing
·         Object/relational paradigm mismatch (problems that arise where the object model and the relational model meet)
o       Problem of Granularity
§         Relational databases can’t offer the granularity of data that’s possible in Java (where a class can hold a reference to another object, an ADT, a basic data type, etc.). Relational databases do support UDTs (user-defined datatypes), but that support has many limitations, so using them isn’t advisable. As a result, the SQL representation is less flexible than the object model.
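The granularity mismatch can be sketched in plain Java. This is only an illustration: the User and Address classes and all column names are hypothetical, and the sketch assumes the common workaround of flattening the fine-grained Address into columns of a single USERS row instead of using a UDT.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical fine-grained value type: it has no table of its own.
class Address {
    final String street;
    final String city;
    final String zipcode;

    Address(String street, String city, String zipcode) {
        this.street = street;
        this.city = city;
        this.zipcode = zipcode;
    }
}

// Hypothetical coarse-grained entity that owns an Address.
class User {
    final String username;
    final Address homeAddress;

    User(String username, Address homeAddress) {
        this.username = username;
        this.homeAddress = homeAddress;
    }

    // Flatten two Java classes into the columns of one USERS row.
    // The column names are assumptions made for this sketch.
    Map<String, String> toUsersRow() {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("USERNAME", username);
        row.put("HOME_STREET", homeAddress.street);
        row.put("HOME_CITY", homeAddress.city);
        row.put("HOME_ZIPCODE", homeAddress.zipcode);
        return row;
    }
}
```

Two classes on the Java side end up as one table with four columns, which is the loss of granularity described above.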
o       Problem of Subtypes
§        Java has type inheritance (subclass and superclass), but a table isn’t a type, so the notion of supertables and subtables is questionable. Where databases do implement it, they don’t follow a standard syntax, and it usually leads to data integrity problems.
§        SQL databases also lack an obvious way (or at least a standardized way) to represent a polymorphic association. A foreign key constraint refers to exactly one target table; it isn’t straightforward to define a foreign key that refers to multiple tables. We’d have to write a procedural constraint to enforce this kind of integrity rule.
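A polymorphic association is easy to write on the Java side, as the following sketch suggests. All class and field names are hypothetical and chosen only for illustration.

```java
// Hypothetical class hierarchy; the names are assumptions for this sketch.
abstract class BillingDetails {
    final String owner;

    BillingDetails(String owner) {
        this.owner = owner;
    }

    abstract String describe();
}

class CreditCard extends BillingDetails {
    final String number;

    CreditCard(String owner, String number) {
        super(owner);
        this.number = number;
    }

    @Override
    String describe() {
        return "credit card " + number;
    }
}

class BankAccount extends BillingDetails {
    final String account;

    BankAccount(String owner, String account) {
        super(owner);
        this.account = account;
    }

    @Override
    String describe() {
        return "bank account " + account;
    }
}

class Customer {
    // Polymorphic association: this reference may point at any subclass.
    // A single foreign key column cannot refer to a credit card table and
    // a bank account table at the same time.
    BillingDetails defaultBilling;
}
```

The one field in Customer has no direct counterpart in a schema where each subclass lives in its own table.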
o       Problem of Identity
§        Java has two notions of sameness: object identity, checked by comparing memory locations (a == b), and equality, determined by the return value of the equals() method [a.equals(b)].
§         In SQL, identity is established using the primary key value.
§         Neither equals() nor == is naturally equivalent to the primary key value.
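The gap between the three notions can be seen in a small sketch. Item and KeyedItem are hypothetical classes, and the id field merely stands in for a primary key value. Basing equals() on the key is shown as one possible choice, not as a recommendation.

```java
import java.util.Objects;

// Hypothetical persistent class; 'id' stands in for the primary key value.
class Item {
    final Long id;
    final String name;

    Item(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Same idea, but equals()/hashCode() are overridden to compare the key.
class KeyedItem {
    final Long id;
    final String name;

    KeyedItem(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof KeyedItem)) return false;
        return Objects.equals(id, ((KeyedItem) other).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}

class IdentityDemo {
    public static void main(String[] args) {
        // Two in-memory objects built from the same hypothetical row (ID 1).
        Item a = new Item(1L, "Laptop");
        Item b = new Item(1L, "Laptop");
        System.out.println(a == b);      // false: different memory locations
        System.out.println(a.equals(b)); // false: Object.equals() falls back to ==

        KeyedItem c = new KeyedItem(1L, "Laptop");
        KeyedItem d = new KeyedItem(1L, "Laptop");
        System.out.println(c == d);      // false: still two objects
        System.out.println(c.equals(d)); // true: equality now follows the key
    }
}
```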
o       Problems related to Associations
§        Object-oriented languages represent associations using object references; but in the relational world, an association is represented as a foreign key column, with copies of key values.
§        Object references are inherently directional; the association is from one object to the other. They’re pointers. If an association between objects should be navigable in both directions, you must define the association twice, once in each of the associated classes.
§        On the other hand, foreign key associations aren’t by nature directional. Navigation has no meaning for a relational data model because you can create arbitrary data associations with table joins and projection.
§        Java associations can have many-to-many multiplicity. Table associations, on the other hand, are always one-to-many or one-to-one. If you wish to represent a many-to-many association in a relational database, you must introduce a new table, called a link table. This table doesn’t appear anywhere in the domain model.
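As a rough sketch of these points, the following plain Java code defines a bidirectional many-to-many association and then derives the rows a link table would hold. Category, Item, and the CATEGORY_ITEM table name are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Category {
    final long id;
    final Set<Item> items = new HashSet<>(); // one direction

    Category(long id) {
        this.id = id;
    }
}

class Item {
    final long id;
    final Set<Category> categories = new HashSet<>(); // the other direction

    Item(long id) {
        this.id = id;
    }
}

class AssociationDemo {
    // Both sides must be updated to keep the bidirectional link consistent.
    static void link(Category category, Item item) {
        category.items.add(item);
        item.categories.add(category);
    }

    // Relational equivalent: one {CATEGORY_ID, ITEM_ID} row per link in a
    // hypothetical CATEGORY_ITEM link table that has no class of its own.
    static List<long[]> linkTableRows(List<Category> categories) {
        List<long[]> rows = new ArrayList<>();
        for (Category c : categories) {
            for (Item i : c.items) {
                rows.add(new long[] { c.id, i.id });
            }
        }
        return rows;
    }
}
```

The association is declared twice in Java (once per class), while the relational side needs a third structure that the domain model never mentions.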
o       Problem of data navigation
§        In Java, when you access a user’s billing information, you call something like user.getBillingDetails().getAccountNumber(). This is the most natural way to access object-oriented data, and it’s often described as walking the object network.
§         Unfortunately, this isn’t an efficient way to retrieve data from an SQL database. You may instead have to query the tables with a join such as this:
select *
from USERS u
left outer join BILLING_DETAILS bd on bd.USER_ID = u.USER_ID
where u.USER_ID = 123
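To make the cost difference concrete, here is a toy sketch that only counts round trips. FakeDatabase is a hypothetical in-memory stand-in, not a real driver or ORM API, and it assumes that each navigation step triggers its own query.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a database; it only counts round trips.
class FakeDatabase {
    int queryCount = 0;
    private final Map<Long, String> users = new HashMap<>();
    private final Map<Long, String> accountNumbers = new HashMap<>();

    FakeDatabase() {
        users.put(123L, "jdoe");
        accountNumbers.put(123L, "ACC-1");
    }

    String selectUser(long userId) {
        queryCount++;
        return users.get(userId);
    }

    String selectAccountNumber(long userId) {
        queryCount++;
        return accountNumbers.get(userId);
    }

    // Equivalent of the outer join: both pieces of data in one round trip.
    String[] selectUserWithBilling(long userId) {
        queryCount++;
        return new String[] { users.get(userId), accountNumbers.get(userId) };
    }
}

class NavigationDemo {
    // Walking the object network: one query per navigation step.
    static int walk(FakeDatabase db) {
        db.selectUser(123L);
        db.selectAccountNumber(123L);
        return db.queryCount;
    }

    // Fetching everything up front with a single joined query.
    static int join(FakeDatabase db) {
        db.selectUserWithBilling(123L);
        return db.queryCount;
    }
}
```

With a single user the difference is two queries versus one. The gap grows if the same walk is repeated for every user in a list.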
·        Serialization can’t be used for the persistence layer. A serialized network of interconnected objects can only be accessed as a whole; it’s impossible to retrieve any data from the stream without deserializing the entire stream. The resulting byte stream is therefore unsuitable for arbitrary search or aggregation over large datasets. It isn’t even possible to access or update a single object or a subset of objects independently. Loading and overwriting an entire object network in each transaction isn’t an option for systems designed to support high concurrency.
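A minimal sketch using the standard java.io serialization classes illustrates the limitation. Account and SerializationDemo are hypothetical names. Note that looking up one balance forces the whole list to be deserialized first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    final String owner;
    final int balance;

    Account(String owner, int balance) {
        this.owner = owner;
        this.balance = balance;
    }
}

class SerializationDemo {
    // Write the whole object network as one opaque byte stream.
    static byte[] save(ArrayList<Account> accounts) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(accounts);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // There is no way to seek to one Account: everything is read back,
    // then searched in memory.
    @SuppressWarnings("unchecked")
    static int balanceOf(byte[] stored, String owner) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(stored))) {
            List<Account> all = (List<Account>) in.readObject();
            for (Account a : all) {
                if (a.owner.equals(owner)) {
                    return a.balance;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        throw new IllegalArgumentException("no such owner: " + owner);
    }
}
```

Updating a single account would likewise mean deserializing the list, changing it, and writing the entire stream again.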
·         Object/relational mapping is the automated (and transparent) persistence of objects in a Java application to the tables in a relational database, using metadata that describes the mapping between the objects and the database.
·         An ORM solution consists of the following four pieces:
o       An API for performing basic CRUD operations on objects of persistent classes.
o       A language or API for specifying queries that refer to classes and properties of classes.
o       A facility for specifying mapping metadata.
o       A technique for the ORM implementation to interact with transactional objects to perform dirty checking, lazy association fetching, and other optimization functions.
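Dirty checking, mentioned in the last item, can be illustrated with a heavily simplified sketch. TinySession and Product are hypothetical and are not Hibernate's API. The sketch assumes a snapshot-comparison approach, which is only one way such a check could work.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class Product {
    final long id;
    String name;

    Product(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical, heavily simplified unit of work: not Hibernate's API.
class TinySession {
    // Maps each managed object to the state it had when last synchronized.
    private final Map<Product, String> snapshots = new HashMap<>();

    // Remember the state an object had when it became managed.
    void manage(Product p) {
        snapshots.put(p, p.name);
    }

    // Dirty checking: compare the current state with the snapshot and
    // report how many objects would need an UPDATE statement.
    int flush() {
        int updates = 0;
        for (Map.Entry<Product, String> e : snapshots.entrySet()) {
            Product p = e.getKey();
            if (!Objects.equals(p.name, e.getValue())) {
                updates++;
                e.setValue(p.name); // new snapshot after the simulated UPDATE
            }
        }
        return updates;
    }
}
```

The application code only sets p.name. It never tells the session that something changed, which is the "transparent" part of the definition above.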
·        The new EJB 3.0 specification comes in several parts: The first part defines the new EJB programming model for session beans and message-driven beans, the deployment rules, and so on. The second part of the specification deals with persistence exclusively: entities, object/relational mapping metadata, persistence manager interfaces, and the query language. This second part is called Java Persistence API (JPA).
·        Hibernate is an open source ORM service implementation.
·        Hibernate is about 80,000 lines of code, some of which is much more difficult than typical application code, along with 25,000 lines of unit test code.
·        Hibernate is part of the JBoss Application Server (JBoss AS), an implementation of J2EE 1.4 and Java EE 5.0. A combination of Hibernate Core, Hibernate Annotations, and Hibernate EntityManager forms the persistence engine of this application server.
