Privacy-assured and Effective Cloud Data Utilization

As the volume of data that enterprises and individuals need to store and use grows rapidly, data owners are motivated to outsource their complex local data management systems to the cloud for its flexibility and cost savings. To protect data privacy and prevent unauthorized access in the cloud and beyond, sensitive data has to be encrypted before outsourcing; this, however, renders traditional data utilization services based on plaintext keyword search unusable. Enabling a privacy-assured search service over encrypted cloud data is therefore of paramount importance. Given the potentially large number of on-demand data users and the huge amount of outsourced data files in the cloud, the problem is particularly challenging, because the service must also meet requirements for performance, system usability, and scalability. This research project aims to explore such a privacy-assured and effective cloud data utilization service with high service-level performance and usability by investigating two challenging research tasks: fuzzy keyword search and ranked keyword search over encrypted cloud data.

 

Fuzzy keyword search, as opposed to exact keyword matching, tolerates minor typos and format inconsistencies in a user's search request and greatly enhances system usability and the user's search experience. The challenge is that two similar words no longer look similar after a one-way cryptographic transformation (as used for encrypted keyword search). To address the problem, we plan to explore a new symbol-based trie-traverse search approach, in which transformed fuzzy keywords extracted from data files are stored in a multi-way tree structure to support efficient search while protecting keyword privacy [2,5].
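To make the idea concrete, the following is a minimal sketch of a symbol-based trie over transformed fuzzy keywords, not the project's actual construction. It assumes a wildcard-based fuzzy set for edit distance 1, HMAC-SHA256 as the one-way transformation, and two hex characters per trie symbol; the names fuzzy_set, trapdoor, build_index, and search are hypothetical. A real scheme would also encrypt the file identifiers stored at the leaves.

```python
import hashlib
import hmac

SYMBOL_LEN = 2  # hex characters per trie symbol (illustrative assumption)

def fuzzy_set(word):
    """Wildcard variants covering every edit of distance <= 1 (assumed scheme)."""
    variants = {word}
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])      # wildcard inserted at position i
    for i in range(len(word)):
        variants.add(word[:i] + "*" + word[i + 1:])  # wildcard replaces character i
    return variants

def trapdoor(key, variant):
    """One-way keyed transformation of a fuzzy keyword variant."""
    return hmac.new(key, variant.encode("utf-8"), hashlib.sha256).hexdigest()

def symbols(digest):
    """Split a hex digest into fixed-size symbols, one per trie level."""
    return [digest[i:i + SYMBOL_LEN] for i in range(0, len(digest), SYMBOL_LEN)]

def build_index(key, keyword_to_files):
    """Data owner: insert every transformed variant into a multi-way tree."""
    root = {}
    for keyword, file_ids in keyword_to_files.items():
        for variant in fuzzy_set(keyword):
            node = root
            for sym in symbols(trapdoor(key, variant)):
                node = node.setdefault(sym, {})
            # Plaintext file IDs here only for illustration.
            node.setdefault("files", set()).update(file_ids)
    return root

def search(index, trapdoors):
    """Server: walk the trie symbol by symbol for each submitted trapdoor."""
    hits = set()
    for td in trapdoors:
        node = index
        for sym in symbols(td):
            node = node.get(sym)
            if node is None:
                break
        else:  # every symbol matched, so this trapdoor is in the index
            hits |= node.get("files", set())
    return hits

key = b"demo-secret-key"
index = build_index(key, {"castle": {"f1", "f3"}, "cloud": {"f2"}})
query = [trapdoor(key, v) for v in fuzzy_set("cattle")]  # one substitution away from "castle"
print(sorted(search(index, query)))
```

In this sketch the server sees only keyed digests, yet "cattle" still retrieves the files indexed under "castle", because both words share the variant "ca*tle" before transformation. The lookup cost depends on the digest length rather than on the number of indexed keywords.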

Ranked keyword search further improves file retrieval accuracy and allows the user to find the most or least relevant information efficiently. We explore a statistical measure from information retrieval (IR), the relevance score, and hide the scores in an order-preserving manner. The resulting design is expected to enable efficient server-side ranking without losing keyword privacy. For practical performance, the system parameters and the corresponding security/efficiency tradeoffs have yet to be thoroughly investigated [1,4].
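As a rough illustration of hiding relevance scores while keeping their order, consider the toy sketch below. It is an assumption-laden example, not the project's design and not a secure order-preserving encryption scheme: the TF-style score formula, the quantization into LEVELS levels, the keyed bucket widths, and the names relevance, quantize, bucket_bounds, and op_map are all hypothetical choices made for this example.

```python
import hashlib
import hmac
import math

LEVELS = 128       # number of quantized score levels (assumption)
BUCKET_MAX = 1000  # maximum width of one hidden bucket (assumption)

def relevance(term_freq, file_len):
    """TF-style relevance of one keyword in one file (one common IR formulation)."""
    return (1 + math.log(term_freq)) / file_len

def quantize(score, max_score):
    """Map a real-valued score to an integer level in [0, LEVELS - 1]."""
    return min(LEVELS - 1, int(score / max_score * LEVELS))

def _prf(key, label):
    """Keyed pseudo-random integer derived from a text label."""
    digest = hmac.new(key, label.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")

def bucket_bounds(key):
    """Keyed, strictly increasing bucket boundaries, one bucket per level."""
    bounds = [0]
    for level in range(LEVELS):
        bounds.append(bounds[-1] + 1 + _prf(key, f"width:{level}") % BUCKET_MAX)
    return bounds

def op_map(key, level, file_id, bounds):
    """Pick a keyed point inside the level's bucket, so equal levels need not collide."""
    lo, hi = bounds[level], bounds[level + 1]
    return lo + _prf(key, f"point:{level}:{file_id}") % (hi - lo)

key = b"demo-secret-key"
files = {"f1": (5, 100), "f2": (1, 100), "f3": (8, 150)}  # file -> (term frequency, length)
scores = {f: relevance(tf, n) for f, (tf, n) in files.items()}
top = max(scores.values())
bounds = bucket_bounds(key)
hidden = {f: op_map(key, quantize(s, top), f, bounds) for f, s in scores.items()}

# The server ranks using only the hidden values.
ranking = sorted(hidden, key=hidden.get, reverse=True)
print(ranking)
```

Because the buckets are disjoint and increasing, a higher score level always maps to a larger hidden value, so the server-side ranking agrees with the plaintext ranking whenever the scores fall into different levels. Files in the same level are ordered arbitrarily. LEVELS and BUCKET_MAX are examples of the system parameters that trade ranking precision and efficiency against how much the hidden values reveal.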

Another promising research direction we propose to explore is secure multi-keyword semantic search, which takes into account keyword conjunctions, keyword order, and even complex natural language semantics to produce highly relevant search results while maintaining stringent privacy guarantees [3].

Publications

  1. Cong Wang, Ning Cao, Kui Ren, and Wenjing Lou, "Enabling Secure and Efficient Ranked Keyword Search over Outsourced Cloud Data", IEEE Transactions on Parallel and Distributed Systems (TPDS), 2011. (A preliminary version of this paper appeared at the 30th International Conference on Distributed Computing Systems (ICDCS'10).)

  2. Cong Wang, Kui Ren, Shucheng Yu, and Karthik Mahendra Raje Urs, "Achieving Usable and Privacy-assured Similarity Search over Outsourced Cloud Data", IEEE INFOCOM'12, Orlando, Florida, March 25-30, 2012.

  3. Ning Cao, Cong Wang, Ming Li, Kui Ren, and Wenjing Lou, "Privacy-Preserving Multi-keyword Ranked Search over Encrypted Cloud Data", The 30th IEEE Conference on Computer Communications (INFOCOM'11), Shanghai, China, April 10-15, 2011.

  4. Cong Wang, Ning Cao, Jin Li, Kui Ren, and Wenjing Lou, "Secure Ranked Keyword Search over Encrypted Cloud Data", The 30th International Conference on Distributed Computing Systems (ICDCS'10), Genoa, Italy, June 21-25, 2010.

  5. Jin Li, Qian Wang, Cong Wang, Ning Cao, Kui Ren, and Wenjing Lou, "Fuzzy Keyword Search over Encrypted Data in Cloud Computing", The 29th IEEE Conference on Computer Communications (INFOCOM'10), mini-conference, San Diego, CA, March 15-19, 2010.

    Disclaimer: The papers here are made available for timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders.