Repository

AEM is built on top of Adobe's CRX platform. CRX is a data storage system specifically designed for content-centric applications. AEM uses this content repository to store all its web content, digital assets, scripts, Java libraries, configuration information and other data. CRX implements the Content Repository API for Java Technology (JCR). This standard defines a data model and application programming interface (that is, a set of commands) for content repositories.




Interview Questions


A content repository, as defined by JCR, combines features of the traditional relational database with those of a conventional file system.

File system-like features supported by JCR include:
  • Hierarchy: Content in a JCR repository can be addressed by path. This is useful when delivering content to the web since most websites are also organized hierarchically.
  • Semi-structured content: JCR can store structured documents, like XML, either as opaque files (as a file system would) or as structures ingested directly into the JCR hierarchy.
  • Access Control and Locking: JCR can restrict access to different parts of the content hierarchy based on policies or ACLs. It also supports locking of content to prevent conflicts.
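Path-based addressing can be pictured with a small toy model. The sketch below is not the JCR API; the `Node` class and `resolve` function are hypothetical names used only to show how a hierarchical path such as /content/site/en maps onto a tree of nodes with properties.

```python
class Node:
    """A toy content node: named children plus a dictionary of properties."""

    def __init__(self, name, properties=None):
        self.name = name
        self.properties = dict(properties or {})
        self.children = {}

    def add(self, name, properties=None):
        """Create a child node and return it."""
        child = Node(name, properties)
        self.children[name] = child
        return child


def resolve(root, path):
    """Walk an absolute path such as /content/site/en from the root node.

    Returns the node at that path, or None if any segment is missing.
    """
    node = root
    for segment in path.strip("/").split("/"):
        if not segment:  # the path was just "/"
            continue
        node = node.children.get(segment)
        if node is None:
            return None
    return node


# Build a tiny hierarchy (hypothetical content) and address a node by path.
root = Node("")
site = root.add("content").add("site")
site.add("en", {"title": "Home"})

page = resolve(root, "/content/site/en")
print(page.properties["title"])
```

In a real repository the same idea applies at a larger scale: the URL structure of a site can mirror the node hierarchy, so a request path can be resolved to content without a separate lookup table.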

Unlike Jackrabbit 2, Oak does not index content by default. Custom indexes need to be created when necessary, much like with traditional relational databases. If no index covers a specific query, Oak traverses the whole repository to answer it, which becomes slow as the repository grows.

For more information, see https://docs.adobe.com/docs/en/aem/6-0/deploy/upgrade/queries-and-indexing.html
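The cost difference between a traversal and an indexed lookup can be sketched with a toy model. This is not Oak code; the function names and the sample paths are assumptions made up for illustration. The traversal visits every node, while the property index answers the same query with a single dictionary lookup.

```python
def query_by_traversal(nodes, prop, value):
    """Check every node: what happens when no index covers the query."""
    visited = 0
    hits = []
    for path, props in nodes.items():
        visited += 1
        if props.get(prop) == value:
            hits.append(path)
    return sorted(hits), visited


def build_property_index(nodes, prop):
    """Map each value of one property to the paths that carry it."""
    index = {}
    for path, props in nodes.items():
        if prop in props:
            index.setdefault(props[prop], []).append(path)
    return index


def query_by_index(index, value):
    """Look the value up directly; no repository walk is needed."""
    return sorted(index.get(value, []))


# Hypothetical repository content: path -> properties.
nodes = {
    "/content/a": {"template": "article"},
    "/content/b": {"template": "landing"},
    "/content/c": {"template": "article"},
    "/content/d": {},
}

hits, visited = query_by_traversal(nodes, "template", "article")
index = build_property_index(nodes, "template")
print(hits, visited)
print(query_by_index(index, "article"))
```

The trade-off is the usual one for indexes: the index has to be built and kept up to date on every write, which is why it makes sense to create one only for properties that are actually queried.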

David's Model:

  • Data First, Structure Later. Maybe.
  • Drive the content hierarchy, don't let it happen.
  • Workspaces are for clone(), merge() and update().
  • Beware of Same Name Siblings.
  • References considered harmful.
  • Files are Files.
  • IDs are evil.
For more information, see https://docs.adobe.com/docs/en/cq/5-6/howto/model_data.html
CRX 2 | CRX 3
CRX 2 is based on Jackrabbit. | CRX 3 is based on Jackrabbit Oak.
Jackrabbit is a pure JCR implementation. | Oak uses a three-tier architecture built around a node state model, with JCR exposed only as a facade.
Jackrabbit uses a Persistence Manager, which writes content to the persistence layer as a blob. | In Oak, MicroKernels write data as native structures of the underlying database. For example, in MongoDB data is written as documents.
Jackrabbit search runs on Lucene. | Oak supports Solr indexing out of the box.
The CRX 2 data store is on the file system by default. | CRX 3 supports multiple data store (binary data storage) configurations. By default the data store is implicit.

When Tar files are used as the storage, they keep growing and claiming disk space every time data is created or updated, because data in Tar files is never overwritten; each change is appended as a new version. To mitigate this, AEM has a garbage collection mechanism known as ‘Tar Compaction’, which removes the unused data and reclaims the disk space.
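The append-then-compact behaviour can be sketched with a toy model. This is not how AEM implements Tar storage internally; `AppendOnlyStore` is a hypothetical class that only illustrates why an append-only store grows on every update and what compaction reclaims.

```python
class AppendOnlyStore:
    """Toy append-only log: updates add a new record, nothing is overwritten."""

    def __init__(self):
        self.records = []   # every version ever written, in write order
        self.latest = {}    # path -> position of the current version

    def write(self, path, value):
        """Append a new version and remember it as the current one."""
        self.latest[path] = len(self.records)
        self.records.append((path, value))

    def read(self, path):
        """Return the current version stored for a path."""
        return self.records[self.latest[path]][1]

    def compact(self):
        """Keep only the current version of each path; return records dropped."""
        before = len(self.records)
        live = [self.records[pos] for pos in sorted(self.latest.values())]
        self.records = live
        self.latest = {path: pos for pos, (path, _) in enumerate(live)}
        return before - len(live)


store = AppendOnlyStore()
store.write("/content/page", "v1")
store.write("/content/page", "v2")
store.write("/content/other", "x")
store.write("/content/page", "v3")

print(len(store.records))   # 4 records stored for only 2 live paths
removed = store.compact()
print(removed, len(store.records), store.read("/content/page"))
```

After compaction only the latest version of each path remains, so reads return the same values as before while the superseded records no longer take up space.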

For the steps to perform Tar compaction, see this blog post: http://www.aemcq5tutorials.com/tutorials/online-offline-tar-compaction-in-aem/

The .bnd file contains extra metadata about the bundle used by the CRXDE build process.

Privileges define the access a user has to the JCR workspace, for example to manage nodes.