Publications (selected)

You can also find all my articles on my Google Scholar profile.

Embedded Functional Dependencies and Data-completeness Tailored Database Design

Published in ACM Transactions on Database Systems (TODS), 2021

This paper establishes a principled schema design framework for data with missing values. The framework is based on the new notion of an embedded functional dependency, which is independent of the interpretation of missing values, able to express completeness and integrity requirements on application data, and capable of capturing redundant data value occurrences that may cause problems with processing data that meets the requirements.

Recommended citation: Wei, Z., & Link, S. (2021). Embedded functional dependencies and data-completeness tailored database design. ACM Transactions on Database Systems (TODS). https://dl.acm.org/doi/abs/10.1145/3450518

Discovery Algorithms for Embedded Functional Dependencies

Published in International Conference on Management of Data (SIGMOD), 2020

Embedded functional dependencies (eFDs) advance data management applications by data completeness and integrity requirements. We show that the discovery problem of eFDs is NP-complete, W[2]-complete in the output, and has a minimum solution space that is larger than the maximum solution space for functional dependencies. Nevertheless, we use novel data structures and search strategies to develop row-efficient, column-efficient, and hybrid algorithms for eFD discovery.

Recommended citation: Wei, Z., Hartmann, S., & Link, S. (2020, June). Discovery algorithms for embedded functional dependencies. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (pp. 833-843). https://dl.acm.org/doi/abs/10.1145/3318464.3389786

Discovery and Ranking of Embedded Uniqueness Constraints

Published in Very Large Data Bases Conference (VLDB), 2019

We conduct the first study on the discovery of embedded uniqueness constraints (eUCs). These constraints represent unique column combinations embedded in complete fragments of incomplete data. We showcase their implementation as filtered indexes, and their application in integrity management and query optimization. We show that the decision variant of discovering a minimal eUC is NP-complete and W[2]-complete. We characterize the maximum possible solution size, and show which families of eUCs attain that size.

Recommended citation: Wei, Z., Leck, U., & Link, S. (2019). Discovery and ranking of embedded uniqueness constraints. Proceedings of the VLDB Endowment, 12(13), 2339-2352. https://dl.acm.org/doi/abs/10.14778/3358701.3358703

Discovery and Ranking of Functional Dependencies

Published in IEEE International Conference on Data Engineering (ICDE), 2019

Using new data structures and original techniques for the dynamic computation of stripped partitions, this paper introduces a new hybridization strategy that leverages memory for computational efficiency.

Recommended citation: Wei, Z., & Link, S. (2019, April). Discovery and ranking of functional dependencies. In 2019 IEEE 35th International Conference on Data Engineering (ICDE) (pp. 1526-1537). IEEE. https://ieeexplore.ieee.org/abstract/document/8731407

DataProf: semantic profiling for iterative data cleansing and business rule acquisition

Published in International Conference on Management of Data (SIGMOD), 2018

We showcase the first semantic data profiler, DataProf. For the constraint class of interest, current profilers compute all constraints that hold on the given data set. DataProf also computes perfect sample records that together satisfy the same constraints as the given data set. Such perfect samples make it easier to spot violations of business rules, which experts can cleanse.

Recommended citation: Wei, Z., & Link, S. (2018, May). DataProf: semantic profiling for iterative data cleansing and business rule acquisition. In Proceedings of the 2018 International Conference on Management of Data (pp. 1793-1796). https://dl.acm.org/doi/abs/10.1145/3183713.3193544

Network, popularity and social cohesion: a game-theoretic approach

Published in AAAI Conference on Artificial Intelligence (AAAI), 2017

We study social cohesion in some special types of graphs and draw a link between social cohesion and a classical notion of structural cohesion by White and Harary. We then focus on the problem of deciding whether a given social network is socially cohesive and show that this problem is coNP-complete.

Recommended citation: Liu, J., & Wei, Z. (2017, February). Network, popularity and social cohesion: a game-theoretic approach. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 31, No. 1). https://ojs.aaai.org/index.php/AAAI/article/view/10568